From stefano at kant.irmkant.rm.cnr.it Mon Feb 1 03:41:40 1993 From: stefano at kant.irmkant.rm.cnr.it (stefano@kant.irmkant.rm.cnr.it) Date: Mon, 1 Feb 1993 02:41:40 -0600 Subject: No subject Message-ID: <9302010841.AA11465@kant.irmkant.rm.cnr.it> The following paper has been placed in the neuroprose archive as nolfi.self-sel.ps.Z Instructions for retrieving and printing follow the abstract. Self-selection of Input Stimuli for Improving Performance Stefano Nolfi Domenico Parisi Institute of Psychology, CNR V.le Marx 15, 00137 Rome - Italy E-mail: stiva at irmkant.Bitnet domenico at irmkant.Bitnet Abstract A system which behaves in an environment can increase its performance level in two different ways. It can improve its ability to react efficiently to any stimulus that may come from the environment or it can acquire an ability to expose itself only to a sub-class of stimuli to which it knows how to respond efficiently. The possibility that a system can solve a task by selecting favourable stimuli is rarely considered in designing intelligent systems. In this paper we show that this type of ability can play a very powerful role in explaining a system's performance. The paper has been published in: G. A. Bekey (1993), Neural Networks and Robotics, Kluwer Academic Publisher. Sorry, no hard copies are available Comments are welcome. Stefano Nolfi Institute of Psychology, CNR V.le Marx, 15 00137 - Rome - Italy email stiva at irmkant.Bitnet _______________________________________________________________________ Here is an example of how to retrieve this file: gvax> ftp archive.cis.ohio-state.edu (or ftp 128.146.8.52) Connected to archive.cis.ohio-state.edu. 220 archive.cis.ohio-state.edu FTP server ready. Name: anonymous 331 Guest login ok, send ident as password. Password:neuron at wherever 230 Guest login ok, access restrictions apply. ftp> binary 200 Type set to I. ftp> cd pub/neuroprose 250 CWD command successful. ftp> get nolfi.self-sel.ps.Z 200 PORT command successful. 150 Opening BINARY mode data connection for nolfi.self-sel.ps.Z 226 Transfer complete. ftp> quit 221 Goodbye. gvax> uncompress nolfi.self-sel.ps.Z gvax> lpr nolfi.self-sel.ps  From sbcho%gorai.kaist.ac.kr at DAIDUK.KAIST.AC.KR Tue Feb 2 11:31:50 1993 From: sbcho%gorai.kaist.ac.kr at DAIDUK.KAIST.AC.KR (Sung-Bae Cho) Date: Tue, 2 Feb 93 11:31:50 KST Subject: Paper Announcement Message-ID: <9302020231.AA01990@gorai.kaist.ac.kr.noname> Feedforward Neural Network Architectures for Complex Classification Problems To appear in the Fuzzy Systems & AI journal (Romanian Academia Publishing House). The idea of this paper was presented at the 2nd International Conference on Fuzzy Logic & Neural Networks, Iizuka-92. Sung-Bae Cho (sbcho at gorai.kaist.ac.kr) and Jin H. Kim Center for Artificial Intelligence Research and Computer Science Department Korea Advanced Institute of Science and Technology 373-1, Koosung-dong, Yoosung-ku, Taejeon 305-701, Republic of Korea Abstract This paper presents two neural network design strategies for incorporating a priori knowledge about a given problem into the feedforward neural networks. These strategies aim at obtaining tractability and reliability for solving complex classification problems by neural networks. The first type strategy based on multistage scheme decomposes the problem into manageable ones for reducing the complexity of the problem, and the second type strategy on multiple network scheme combines incomplete decisions from several copies of networks for reliable decision-making. 
A preliminary experiment of recognizing on-line handwriting characters confirms the superiority relative to a single large neural network classifier. Key words: neural network architecture design, multistage neural network, multiple neural networks, synthesis method, voting method, expert judgement, handwriting character recognition ----- Now available in the neuroprose archive: archive.cis.ohio-state.edu (128.146.8.52) pub/neuroprose directory under the file name sbcho.nn_architects.ps.Z (compressed PostScript).  From ro2m at crab.psy.cmu.edu Mon Feb 1 11:34:06 1993 From: ro2m at crab.psy.cmu.edu (Randall C. O'Reilly) Date: Mon, 1 Feb 93 11:34:06 EST Subject: 2 pdp.cns TR's available Message-ID: <9302011634.AA06379@crab.psy.cmu.edu.noname> The following two (related) TR's are now available for electronic ftp or by hardcopy. Instructions follow the abstracts. >>> NOTE THAT THE FTP SITE IS OUR OWN, NOT NEUROPROSE <<< Object Recognition and Sensitive Periods: A Computational Analysis of Visual Imprinting Randall C. O'Reilly Mark H. Johnson Technical Report PDP.CNS.93.1 (Submitted to Neural Computation) Abstract: Evidence from a variety of methods suggests that a localized portion of the domestic chick brain, the Intermediate and Medial Hyperstriatum Ventrale (IMHV), is critical for filial imprinting. Data further suggest that IMHV is performing the object recognition component of imprinting, as chicks with IMHV lesions are impaired on other tasks requiring object recognition. We present a neural network model of translation invariant object recognition developed from computational and neurobiological considerations that incorporates some features of the known local circuitry of IMHV. In particular, we propose that the recurrent excitatory and lateral inhibitory circuitry in the model, and observed in IMHV, produces hysteresis on the activation state of the units in the model and the principal excitatory neurons in IMHV. Hysteresis, when combined with a simple Hebbian covariance learning mechanism, has been shown in earlier work to produce translation invariant visual representations. To test the idea that IMHV might be implementing this type of object recognition algorithm, we have used a simple neural network model to simulate a variety of different empirical phenomena associated with the imprinting process. These phenomena include reversibility, sensitive periods, generalization, and temporal contiguity effects observed in behavioral studies of chicks. In addition to supporting the notion that these phenomena, and imprinting itself, result from the IMHV properties captured in the simplified model, the simulations also generate several predictions and clarify apparent contradictions in the behavioral data. ----------------------------------------------------------------------- The Self-Organization of Spatially Invariant Representations Randall C. O'Reilly James L. McClelland Technical Report PDP.CNS.92.5 Abstract: The problem of computing object-based visual representations can be construed as the development of invariancies to visual dimensions irrelevant for object identity. This view, when implemented in a neural network, suggests a different set of algorithms for computing object-based visual representations than the ``traditional'' approach pioneered by Marr, 1981. A biologically plausible self-organizing neural network model that develops spatially invariant representations is presented. 
There are four features of the self-organizing algorithm that contribute to the development of spatially invariant representations: temporal continuity of environmental stimuli, hysteresis of the activation state (via recurrent activation loops and lateral inhibition in an interactive network), Hebbian learning, and a split pathway between ``what'' and ``where'' representations. These constraints are tested with a backprop network, which allows for the evaluation of the individual contributions of each constraint on the development of spatially invariant representations. Subsequently, a complete model embodying a modified Hebbian learning rule and interactive connectivity is developed from biological and computational considerations. The activational stability and weight function maximization properties of this interactive network are analyzed using a Lyapunov function approach. The model is tested first on the same simple stimuli used in the backprop simulation, and then with a more complex environment consisting of right and left diagonal lines. The results indicate that the hypothesized constraints, implemented in a Hebbian network, were capable of producing spatially invariant representations. Further, evidence for the gradual integration of both featural complexity and spatial invariance over increasing layers in the network, thought to be important for real-world applications, was obtained. As the approach is generalizable to other dimensions such as orientation and size, it could provide the basis of a more complete biologically plausible object recognition system. Indeed, this work forms the basis of a recent model of object recognition in the domestic chick (O'Reilly & Johnson, 1993, TR PDP.CNS.93.1). ----------------------------------------------------------------------- Retrieval information for pdp.cns TRs: unix> ftp 128.2.248.152 # hydra.psy.cmu.edu Name: anonymous Password: ftp> cd pub/pdp.cns ftp> binary ftp> get pdp.cns.93.1.ps.Z # or, and ftp> get pdp.cns.92.5.ps.Z ftp> quit unix> zcat pdp.cns.93.1.ps.Z | lpr # or however you print postscript unix> zcat pdp.cns.92.5.ps.Z | lpr For those who do not have FTP access, physical copies can be requested from Barbara Dorney .  From tresp at inf21.zfe.siemens.de Tue Feb 2 12:29:10 1993 From: tresp at inf21.zfe.siemens.de (Volker Tresp) Date: Tue, 2 Feb 1993 18:29:10 +0100 Subject: paper in neuroprose Message-ID: <199302021729.AA24088@inf21.zfe.siemens.de> The following paper has been placed in the neuroprose archive as tresp.rules.ps.Z Instructions for retrieving and printing follow the abstract. ----------------------------------------------------------------- NETWORK STRUCTURING AND TRAINING USING RULE-BASED KNOWLEDGE ----------------------------------------------------------------- Volker Tresp, Siemens, Central Research Juergen Hollatz, TU Muenchen Subutai Ahmad, Siemens, Central Research Abstract We demonstrate in this paper how certain forms of rule-based knowledge can be used to prestructure a neural network of normalized basis functions and give a probabilistic interpretation of the network architecture. We describe several ways to assure that rule-based knowledge is preserved during training and present a method for complexity reduction that tries to minimize the number of rules and the number of conjuncts. After training, the refined rules are extracted and analyzed. To appear in: S. J. Hanson, J. D. Cowan, and C. L. Giles (Eds.), Advances in Neural Information Processing Systems 5. San Mateo CA: Morgan Kaufmann. 
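As a concrete (if highly simplified) picture of a network of normalized basis functions of the kind described in the abstract, here is a short Python sketch of the forward pass, with each basis unit playing the role of one rule. The rule centres, widths and conclusions are made-up values, and the rule-to-unit mapping shown is only one plausible reading of the abstract, not necessarily the construction used in the paper.

import numpy as np

# Each "rule" of the form  IF x is near c_i THEN y = w_i  is encoded as one
# basis unit with centre c_i, width s_i and output weight w_i.  The values
# below are illustrative; in the paper the rules come from prior knowledge
# and are preserved/refined during training.
centres = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])   # one row per rule
widths  = np.array([0.5, 0.5, 0.5])                         # one width per rule
weights = np.array([0.0, 1.0, 0.5])                         # rule conclusions

def normalized_basis_output(x):
    """Forward pass of a normalized-basis-function network.

    The unnormalized activations are Gaussians around the rule centres; the
    normalization step makes them sum to one, which is what supports a
    probabilistic (mixture-like) interpretation of the architecture.
    """
    d2 = np.sum((centres - x) ** 2, axis=1)
    act = np.exp(-d2 / (2.0 * widths ** 2))
    act = act / np.sum(act)          # normalization across basis units
    return float(act @ weights)      # weighted sum of rule conclusions

print(normalized_basis_output(np.array([0.9, 0.9])))   # dominated by rule 2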
---- Volker Tresp Siemens AG, Central Research, Phone: +49 89 636-49408 Otto-Hahn-Ring 6, FAX: +49 89 636-3320 W-8000 Munich 83, Germany E-mail: tresp at zfe.siemens.de unix> ftp archive.cis.ohio-state.edu (or 128.146.8.52) Name: anonymous Password: neuron ftp> cd pub/neuroprose ftp> binary ftp> get tresp.rules.ps.Z ftp> quit unix> uncompress tresp.rules.ps.Z unix> lpr -s tresp.rules.ps (or however you print postscript)  From denis at psy.ox.ac.uk Wed Feb 3 10:47:36 1993 From: denis at psy.ox.ac.uk (Denis Mareschal) Date: Wed, 3 Feb 93 15:47:36 GMT Subject: visual tracking Message-ID: <9302031547.AA09779@dragon.psych.pdp> Hi, A couple of months ago I sent around a request for further information concerning higher level connectionist approaches to the development of visual tracking. I received a number of replies spanning the broad range of fields in which neural network research is being conducted. I also received a significant number of requests for the resulting compiled list of references. I am thus posting a list of references resulting directly and indirectly from my original request. I have also included a few relevant psychology review papers. Thanks to all those who replied. Clearly this list is not exhaustive and if anyone reading it notices an ommission which may be of interest I would greatly appreciate hearing from them. Cheers, Denis Mareschal Department of Experimental Psychology South Parks Road Oxford University Oxford OX1 3UD maresch at black.ox.ac.uk REFERENCES: Allen, R. B. (1988), Sequential connectionist networks for answering simple questions about a microworld. In: Proceedings of the Tenth Annual Conference of the Cognitive Science Society, pp. 489-495, Hillsdale, NJ: Erlbaum. Baloch, A. A. & Waxman A. M. (1991). Visual learning, adaptive expectations and behavioral conditioning of the mobile robot MAVIN, Neural Networks, vol. 4, pp. 271-302. Buck, D. S. & Nelson D. E. (1992). Applying the abductory induction mechanism (AIM) to the extrapolation of chaotic time series. In: Proceedings of the National Aerospace Electronics Conference (NAECON), 18-22 May, Dayton, Ohio, vol. 3, pp 910-915. Bremner, J. G. (1985). Object tracking and search in infancy: A review of data and a theoretical evaluation, Developmental Review, 5, pp. 371-396 Carpenter, G. A. & Grossberg, S. (1992). Neural Networks for Vision and Image Processing, Cambridge, MA: MIT Press. Cleermans, A., Servan-Schreiber, D. & McClelland, J. L. (1989). Finite state automata and simple recurrent networks, Neural Computation,1, pp 372- 381. Deno, D. C., Keller, E. L. & Crandall, W. F. (1989). Dynamical neural network organization of the visual pursuit system, IEEE Transactions on Biomedical Engineering, vol. 36, pp. 85-91. Dobnikar, A., Likar, A. & Podbregar, D. (1989). Optimal visual tracking with artificial neural network. In: First I.E.E. International Conference on Artificial Neural Networks (conf. Publ. 313), pp 275-279. Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14, pp. 179-211. Ensley, D. & Nelson, D. E. (1992). Applying Cascade-correlation to the extrapolation of chaotic time series. Proceedings of the Third Workshop on Neural Networks: Academic/Industrial/NASA/Defense; 10-12 February, Auburn, Alabama. Fay, D. A. & Waxman, A. M. (1992). Neurodynamics of real-time image velocity extraction. In: G. A. Carpenter & S. Grossberg (Eds), Neural Networks for Vision and Image Processing, pp 221-246, Cambridge, MA: MIT Press. Gordon, Steele, & Rossmiller (1991). 
Predicting trajectories using recurrent neural networks. In: Dagli, Kumara, & Shin (Eds), Intelligent Systems Through Artificial Neural Networks, ASME Press. (Sorry that's the best I can do for this reference) Grossberg, S. & Rudd (1989). A neural architecture for visual motion perception: Neural Networks, 2, pp. 421-450. Koch, C. & Ullman, S. (1985). Shifts in selective visual attention: towards the underlying neural circuitry. Human Neurobiology, 4, pp. 219-227. Lisberger, S. G., Morris, E. J. & Tychsen, L. (1987). Visual motion processing and sensory-motor integration for smooth pursuit eye movements, Annual Review of Neuroscience, 10, pp. 97-129. Lumer, E. D. (1992). The phase tracker of attention. In: Proceedings of the Fourteenth Annual Conference of the Cognitive Science Society, pp 962-967, Hillsdale, NJ: Erlbaum. Neilson, P. D., Neilson, M. D. & O'Dwyer, N. J. (1993, in press). What limits high speed tracking performance?, Human Movement Science, 12. Nelson, D. E., Ensley, D. D. & Rogers, S. K. (1992). Prediction of chaotic time series using Cascade Correlation: Effects of number of inputs and training set size. In: The Society for Optical Engineering (SPIE), Proceedings of the Applications of Artificial Neural Networks III Conference, 21-24 April, Orlando, Florida, vol. 1709, pp 823-829. Marshall, J. A. (1990). Self-organizing neural networks for perception of visual motion, Neural Networks, 3, pp. 45-74. Martin, W. N. & Aggarwal, J. K. (Eds) (1988). Motion Understanding: Robot and Human Vision. Boston: Kluwer Academic Publishers. Metzgen, Y. & Lehmann, D. (1990). Learning temporal sequences by local synaptic changes, Network, 1, pp. 271-302. Nakayama, K. (1985). Biological image motion processing: A review. Vision Research, 25, pp 625-660. Parisi, D., Cecconi, F. & Nolfi, S. (1990). Econets: Neural networks that learn in an environment, Network, 1, pp. 149-168. Pearlmutter, B. A. (1989). Learning state space trajectories in recurrent networks, Neural Computation, 1, pp. 263-269. Regier, T. (1992). The acquisition of lexical semantics for spatial terms: A connectionist model of perceptual categorization. International Computer Science Institute (ICSI) Technical Report TR-92-062, Berkeley. Schmidhuber, J. & Huber, R. (1991). Using adaptive sequential neurocontrol for efficient learning of translation and rotation invariance. In: T. Kohonen, K. Makisara, O. Simula & J. Kangas (Eds), Artificial Neural Networks, pp 315-320, North Holland: Elsevier Science. Schmidhuber, J. & Huber, R. (1991). Learning to generate artificial foveal trajectories for target detection. International Journal of Neural Systems, 2, pp. 135-141. Schmidhuber, J. & Wahnsiedler, R. (1992). Planning simple trajectories using neural subgoal generators. Second International Conference on Simulations of Adaptive Behavior (SAB92). (Available by ftp from Jordan Pollack's Neuroprose Archive). Sereno, M. E. (1986). Neural network model for the measurement of visual motion. Journal of the Optical Society of America A, 3, pp 72. Sereno, M. E. (1987). Implementing stages of motion analysis in neural. Program of the Ninth Annual Conference of the Cognitive Science Society, pp. 405-416, Hillsdale, NJ: Erlbaum. Servan-Schreiber, D., Cleeremans, A. & McClelland, J. L. (1991). Graded state machines: The representation of temporal contingencies in simple recurrent networks, Machine Learning, 7, pp. 161-193. Shimohara, K., Uchiyama, T. & Tokunaya, Y. (1988). Back propagation networks for event-driven temporal sequence processing.
In: IEEE International Conference on Neural Networks (San Diego), vol. 1, pp. 665-672, NY: IEEE. Sutton, R. S. (1988). Learning to predict by the methods of temporal differences, Machine Learning, 3, pp 9-44. Tolg, S. (1991). A biologically motivated system to track moving objects by active camera control. In: T. Kohonen, K. Makisara, O. Simula & J. Kangas (Eds), Artificial Neural Networks, pp 1237-1240, North Holland: Elsevier Science. Wechsler, H. (Ed) (1991). Neural Networks for Human and Machine Perception, New York: Academic Press.  From gluck at pavlov.rutgers.edu Wed Feb 3 09:13:20 1993 From: gluck at pavlov.rutgers.edu (Mark Gluck) Date: Wed, 3 Feb 93 09:13:20 EST Subject: Preprint: Computational Models of the Neural Bases of Learning and Memory Message-ID: <9302031413.AA24540@james.rutgers.edu> For (hard copy) preprints of the following article: Gluck, M. A. & Granger, R. C. (1993). Computational models of the neural bases of learning and memory. Annual Review of Neuroscience. 16: 667-706 ABSTRACT: Advances in computational analyses of parallel-processing have made computer simulation of learning systems an increasingly useful tool in understanding complex aggregate functional effects of changes in neural systems. In this article, we review current efforts to develop computational models of the neural bases of learning and memory, with a focus on the behavioral implications of network-level characterizations of synaptic change in three anatomical regions: olfactory (piriform) cortex, cerebellum, and the hippocampal formation. ____________________________________ Send US-mail address to: Mark Gluck (Center for Neuroscience, Rutgers-Newark) gluck at pavlov.rutgers.edu  From robtag at udsab.dia.unisa.it Wed Feb 3 13:22:31 1993 From: robtag at udsab.dia.unisa.it (Tagliaferri Roberto) Date: Wed, 3 Feb 1993 19:22:31 +0100 Subject: course on Hybrid Systems Message-ID: <199302031822.AA08460@udsab.dia.unisa.it> **************** IIASS 1993 February Courses ************** **************** Last Announcement ************** A short course on "Hybrid Systems: Neural Nets, Fuzzy Sets and A.I. Systems" February 9 - 12 Lecturers: Dr. Silvano Colombano, NASA Research Center, CA Prof. Piero Morasso, Univ. Genova, Italia ----------------------------------------------------------------- Dr. Silvano Colombano (4 hours) Introduction: extending the representational power of connectionism The interim approach: hybrid symbolic connectionist systems - Distributed - Localist - Mixed localist and distributed (3 hours) Hybrid Fuzzy Logic connectionist systems - Classification - Control - Reasoning (2 hours) A competing approach: classifier systems Future directions Prof. Piero Morasso (2 hours) Self-organizing Systems and Hybrid Systems Course schedule February 9 3 pm - 6 pm Dr. S. Colombano February 10 3 pm - 6 pm Dr. S. Colombano February 11 3 pm - 6 pm Dr. S. Colombano February 12 3 pm - 5 pm Prof. P. Morasso The course will be held at IIASS, via G. Pellegrino, Vietri s/m (Sa) Italia. Participants will pay their own fare and travel expenses. No fees to be paid. The short course is sponsored by Progetto Finalizzato CNR "Sistemi Informatici e Calcolo Parallelo" and by Contratto quinquennale CNR-IIASS For any information about the short course, please contact the IIASS secretariat I.I.A.S.S Via G.Pellegrino, 19 I-84019 Vietri Sul Mare (SA) ITALY Tel. +39 89 761167 Fax +39 89 761189 or Dr.
Roberto Tagliaferri E-Mail robtag at udsab.dia.unisa.it  From uli at ira.uka.de Thu Feb 4 12:06:41 1993 From: uli at ira.uka.de (Uli Bodenhausen) Date: Thu, 04 Feb 93 18:06:41 +0100 Subject: new papers in the neuroprose archive Message-ID: The following papers have been placed in the neuroprose archive as bodenhausen.application_oriented.ps.Z bodenhausen.architectural_learning.ps.Z Instructions for retrieving and printing follow the abstracts. 1.) CONNECTIONIST ARCHITECTURAL LEARNING FOR HIGH PERFORMANCE CHARACTER AND SPEECH RECOGNITION Ulrich Bodenhausen and Stefan Manke University of Karlsruhe and Carnegie Mellon University Highly structured neural networks like the Time-Delay Neural Network (TDNN) can achieve very high recognition accuracies in real world applications like handwritten character and speech recognition systems. Achieving the best possible performance greatly depends on the optimization of all structural parameters for the given task and amount of training data. We propose an Automatic Structure Optimization (ASO) algorithm that avoids time-consuming manual optimization and apply it to Multi State Time-Delay Neural Networks, a recent extension of the TDNN. We show that the ASO algorithm can construct efficient architec tures in a single training run that achieve very high recognition accuracies for two handwritten character recognition tasks and one speech recognition task. (only 4 pages!) To appear in the proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP) 93, Minneapolis -------------------------------------------------------------------------- 2.) Application Oriented Automatic Structuring of Time-Delay Neural Networks for High Performance Character and Speech Recognition Ulrich Bodenhausen and Alex Waibel University of Karlsruhe and Carnegie Mellon University Highly structured artificial neural networks have been shown to be superior to fully connected networks for real-world applications like speech recognition and handwritten character recognition. These structured networks can be optimized in many ways, and have to be optimized for optimal performance. This makes the manual optimization very time consuming. A highly structured approach is the Multi State Time Delay Neural Network (MSTDNN) which uses shifted input windows and allows the recognition of sequences of ordered events that have to be observed jointly. In this paper we propose an Automatic Structure Optimization (ASO) algorithm and apply it to MSTDNN type networks. The ASO algorithm optimizes all relevant parameters of MSTDNNs automatically and was successfully tested with three different tasks and varying amounts of training data. (6 pages, more detailed than the first paper) To appear in the ICNN 93 proceedings, San Francisco. -------------------------------------------------------------------------- unix> ftp archive.cis.ohio-state.edu (or 128.146.8.52) Name: anonymous Password: neuron ftp> cd pub/neuroprose ftp> binary ftp> get bodenhausen.application_oriented.ps.Z ftp> get bodenhausen.architectural_learning.ps.Z ftp> quit unix> uncompress bodenhausen.application_oriented.ps.Z unix> uncompress bodenhausen.architectural_learning.ps.Z unix> lpr -s bodenhausen.application_oriented.ps (or however you print postscript) unix> lpr -s bodenhausen.architectural_learning.ps Thanks to Jordan Pollack for providing this service!  
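For readers unfamiliar with the shifted-input-window idea behind the TDNN and MSTDNN architectures described in the abstracts above, here is a minimal single-layer sketch in Python. The window length, feature sizes and random weights are arbitrary illustrative choices; a real recognizer would stack several such layers and train the shared weights, and the structural parameters are exactly what the ASO algorithm above optimizes automatically.

import numpy as np

def time_delay_layer(frames, W, b):
    """Apply one time-delay layer to a sequence of input frames.

    frames : array of shape (T, n_in)        -- e.g. spectral or pen frames
    W      : array of shape (d, n_in, n_out) -- weights shared over a window of d frames
    b      : array of shape (n_out,)
    The same weights are applied to every shifted window, which is what
    gives the layer its shift (time) invariance.
    """
    d, n_in, n_out = W.shape
    T = frames.shape[0]
    out = np.zeros((T - d + 1, n_out))
    for t in range(T - d + 1):
        window = frames[t:t + d]                            # shifted input window
        out[t] = np.tanh(np.einsum('di,dio->o', window, W) + b)
    return out

# toy usage with made-up sizes: 20 frames of 16 features, window of 3, 8 hidden units
rng = np.random.default_rng(0)
h = time_delay_layer(rng.normal(size=(20, 16)),
                     rng.normal(scale=0.1, size=(3, 16, 8)),
                     np.zeros(8))
print(h.shape)   # (18, 8)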
From moody at chianti.cse.ogi.edu Thu Feb 4 20:38:08 1993 From: moody at chianti.cse.ogi.edu (John Moody) Date: Thu, 4 Feb 93 17:38:08 -0800 Subject: NATO ASI: March 5 Deadline Approaching Message-ID: <9302050138.AA00659@chianti.cse.ogi.edu> As the March 5th application deadline is now four weeks away, I am posting this notice again. NATO Advanced Studies Institute (ASI) on Statistics and Neural Networks June 21 - July 2, 1993, Les Arcs, France Directors: Professor Vladimir Cherkassky, Department of Electrical Eng., University of Minnesota, Minneapolis, MN 55455, tel.(612)625-9597, fax (612)625- 4583, email cherkass at ee.umn.edu Professor Jerome H. Friedman, Statistics Department, Stanford University, Stanford, CA 94309 tel(415)723-9329, fax(415)926-3329, email jhf at playfair.stanford.edu Professor Harry Wechsler, Computer Science Department, George Mason University, Fairfax VA22030, tel(703)993-1533, fax(703)993-1521, email wechsler at gmuvax2.gmu.edu List of invited lecturers: I. Alexander, L. Almeida, A. Barron, A. Buja, E. Bienenstock, G. Carpenter, V. Cherkassky, T. Hastie, F. Fogelman, J. Friedman, H. Freeman, F. Girosi, S. Grossberg, J. Kittler, R. Lippmann, J. Moody, G. Palm, R. Tibshirani, H. Wechsler, C. Wellekens Objective, Agenda and Participants: Nonparametric estimation is a problem of fundamental importance for many applications involving pattern classification and discrimination. This problem has been addressed in Statistics, Pattern Recognition, Chaotic Systems Theory, and more recently in Artificial Neural Network (ANN) research. This ASI will bring together leading researchers from these fields to present an up-to-date review of the current state-of-the art, to identify fundamental concepts and trends for future development, to assess the relative advantages and limitations of statistical vs neural network techniques for various pattern recognition applications, and to develop a coherent framework for the joint study of Statistics and ANNs. Topics range from theoretical modeling and adaptive computational methods to empirical comparisons between statistical and neural network techniques. Lectures will be presented in a tutorial manner to benefit the participants of ASI. A two-week programme is planned, complete with lectures, industrial/government sessions, poster sessions and social events. It is expected that over seventy students (which can be researchers or practitioners at the post-graduate or graduate level) will attend, drawn from each NATO country and from Central and Eastern Europe. The proceedings of ASI will be published by Springer-Verlag. Applications: Applications for participation at the ASI are sought. Prospective students, industrial or government participants should send a brief statement of what they intend to accomplish and what form their participation would take. Each application should include a curriculum vitae, with a brief summary of relevant scientific or professional accomplishments, and a documented statement of financial need (if funds are applied for). Optionally, applications may include a one page summary for making a short presentation at the poster session. Poster presentations focusing on comparative evaluation of statistical and neural network methods and application studies are especially sought. For junior applicants, support letters from senior members of the professional community familiar with the applicant's work would strengthen the application. 
Prospective participants from Greece, Portugal and Turkey are especially encouraged to apply. Costs and Funding: The estimated cost of hotel accommodations and meals for the two-week duration of the ASI is US$1,600. In addition, participants from industry will be charged an industrial registration fee, not to exceed US$1,000. Participants representing industrial sponsors will be exempt from the fee. We intend to subsidize costs of participants to the maximum extent possible by available funding. Prospective participants should also seek support from their national scientific funding agencies. The agencies, such as the American NSF or the German DFG, may provide some ASI travel funds upon the recommendation of an ASI director. Additional funds exist for students from Greece, Portugal and Turkey. We are also seeking additional sponsorship of ASI. Every sponsor will be fully acknowledged at the ASI site as well as in the printed proceedings. Correspondence and Registration: Applications should be forwarded to Dr. Cherkassky at the above address. Applications arriving after March 5, 1993 may not be considered. All approved applicants will be informed of the exact registration arrangements. Informal email inquiries can be addressed to Dr. Cherkassky at nato_asi at ee.umn.edu  From takagi at diva.berkeley.edu Thu Feb 4 21:48:15 1993 From: takagi at diva.berkeley.edu (Hideyuki Takagi) Date: Thu, 4 Feb 93 18:48:15 -0800 Subject: BISC Special Seminar Message-ID: <9302050248.AA02922@diva.Berkeley.EDU> Dear Colleagues: We will hold the BISC Special Seminar at UC Berkeley one day before FUZZ-IEEE'93/ICNN'93. Please forward the following announcement to widely. Hideyuki TAKAGI ----------------------------------------------------------------------- EXTENDED BISC SPECIAL SEMINAR 10:30AM-5:45PM, March 28 (Sunday), 1993 Sibley Auditorium (210) in Bechtel Hall University of California, Berkeley CA 94720 BISC (Berkeley Initiative for Soft Computing) of UC Berkeley will hold a Special Seminar to take advantage of the presence in the San Francisco area of the luminaries attending FUZZ-IEEE'93/ICNN'93. We hope that your schedule will allow you to participate. PROGRAM: 10:30-11:00 Lotfi A. Zadeh (Univ. of California, Berkeley) Soft Computing 11:00-12:00 Hidetomo Ichihashi / Univ. of Osaka Prefecture Neuro-Fuzzy Approaches to Optimization and Inverse Problems 12:00- 1:30 (lunch) 1:30- 2:30 Philippe Smets (Iridia Universite Libre de Bruxelles) Imperfect information : Imprecision - Uncertainty 2:30- 3:30 Teuvo Kohonen (Helsinki University of Technology) Competitive-Learning Neural Networks are closest to Biology 3:30- 3:45 (break) 3:45- 4:45 Michio Sugeno (Tokyo Institute of Technology) Fuzzy Modeling towards Qualitative Modeling 4:45- 5:45 Hugues Bersini (Iridia Universite Libre de Bruxelles) The Immune Learning Mechanisms: Reinforcement, Recruitment and their Applications REGISTRATION: Attendance is free and registration is not required. HOW TO GET HERE: [BART subway from San Francisco downtown] The closest station to the SF Hilton Hotel is the Powell Str. Station. Berkeley is a safe 24 minute ride from the Powell Str. Station. You must catch the Concord bound train and transfer onto a Richmond bound train at the Oakland City Center-12th Str. Station. Trains on Sunday rendezvous every 20 minutes as indicated below. Powell 12th Str. 
Berkeley 8:17 ---- 8:31 8:31 ---- 8:41 8:37 ---- 8:51 8:51 ---- 9:01 8:57 ---- 9:11 9:11 ---- 9:21 9:17 ---- 9:31 9:31 ---- 9:41 9:37 ---- 9:51 9:51 ---- 10:01 It takes 15-20 minutes on foot from the Berkeley BART Station to reach Bechtel Hall, which is located on the North-East part of campus. Bechtel Hall is just North of Evans Hall, home of the Computer Science Division. North Gate is the nearest campus gate. [TAXI] You can take a taxi from the front of the Berkeley BART Station. Ask the taxi driver to enter from East Gate on campus and let you off at Mining Circle. The tallest building adjacent to the circle is Evans Hall. Bechtel Hall is just north of the Evans. [CAR] Get off at the University Ave. exit from Interstate 80. The east end of University Ave. is the West Gate to UC Berkeley. Most street parking is free on Sunday, but it may be scarce and remember to read the signs. If you feel you must park in a lot, we recommend UCB Parking Structure H which is located at the corner of Hearst and La Loma Avenues. You must buy an all day parking ticket from the vending machine located on the 2nd level (the only one in the structure). You need to prepare 12 quarters. Illegal parking in Berkeley is expensive. CONTACT ADDRESS: Hideyuki TAKAGI, Coordinator of this seminar (takagi at cs.berkeley.edu) Lotfi A. Zadeh, Director of BISC (zadeh at cs.berkeley.edu) Computer Science Division University of California at Berkeley Berkeley, CA 94720 FAX <+1>510-642-5775  From ira at linus.mitre.org Fri Feb 5 10:06:53 1993 From: ira at linus.mitre.org (ira@linus.mitre.org) Date: Fri, 5 Feb 93 10:06:53 -0500 Subject: vision position posting Message-ID: <9302051506.AA09737@ellington.mitre.org> Neural Network Vision Research Position The MITRE Corporation is looking for a Vision Modeler with an excellent math background, knowledge of signal processing techniques, considerable experience modeling biological low-level vision processes and broad knowledge of current neural network learning algorithm research. This is an *applied* research position which has as its goal the application of vision modeling techniques to real tasks such as 2D and 3D object recognition in synthetic and real world imagery. This position requires software implementation of models in C language. The position may also involve management responsibilities. The position is located in Bedford, Massachusetts. We are looking for someone with availability within the next two months. Interested applicants should send a resume and representative publications to: Ira Smotroff Lead Scientist The MITRE Corporation MS K331 202 Burlington Rd. Bedford, MA 01730-1420  From heiniw at sun1.eeb.ele.tue.nl Fri Feb 5 09:56:17 1993 From: heiniw at sun1.eeb.ele.tue.nl (Heini Withagen) Date: Fri, 5 Feb 1993 15:56:17 +0100 (MET) Subject: Does backprop need the derivative ?? Message-ID: <9302051456.AA02038@sun1.eeb.ele.tue.nl> A non-text attachment was scrubbed... Name: not available Type: text Size: 1054 bytes Desc: not available Url : https://mailman.srv.cs.cmu.edu/mailman/private/connectionists/attachments/00000000/b760eda0/attachment.ksh From wallyn at capsogeti.fr Fri Feb 5 13:05:20 1993 From: wallyn at capsogeti.fr (Alexandre Wallyn) Date: Fri, 5 Feb 93 19:05:20 +0100 Subject: Neural networks in Product modelling Message-ID: <9302051805.AA13434@gizmo> I am trying to evaluate the state of the art in the connectionist applications in Product Modelling (or engineering design). 
After looking in several journals (Neural Networks, IJCNN proceedings, Neuro-Nimes, and some history of connectionist mailing list), I only found: "Neural Network in Engineering Design" (H.Adeli, IJCNN 1990) (very general) Indirect quotations of general work in AI Wright University (1988) Modelling of MOS components in University of Dortmund (1990) and CadChem product of AIWare for product modelling and chemical formulation (seem to be uses by General Tire and Good Year). Are these applications in product modelling so scarce, or are they published in other forums ? I thank you in advance for your help. I will, of course, publish a summary of the replies. Alexandre Wallyn CAP GEMINI INNOVATION 86-90, rue Thiers 92513 BOULOGNE FRANCE wallyn at capsogeti.fr  From ira at linus.mitre.org Fri Feb 5 10:14:46 1993 From: ira at linus.mitre.org (ira@linus.mitre.org) Date: Fri, 5 Feb 93 10:14:46 -0500 Subject: vision position: US Citizens only Message-ID: <9302051514.AA09747@ellington.mitre.org> Sorry to clutter your mail boxes. The Neural Network Vision Position at The MITRE Corporation is open only to US Citizens. Ira Smotroff  From Scott_Fahlman at SEF-PMAX.SLISP.CS.CMU.EDU Fri Feb 5 22:55:28 1993 From: Scott_Fahlman at SEF-PMAX.SLISP.CS.CMU.EDU (Scott_Fahlman@SEF-PMAX.SLISP.CS.CMU.EDU) Date: Fri, 05 Feb 93 22:55:28 EST Subject: Does backprop need the derivative ?? In-Reply-To: Your message of Fri, 05 Feb 93 15:56:17 +0100. <9302051456.AA02038@sun1.eeb.ele.tue.nl> Message-ID: In his paper, 'An Empirical Study of Learning Speed in Back-Propagation Networks', Scott E. Fahlmann shows that with the encoder/decoder problem it is possible to replace the derivative of the transfer function by a constant. I have been able to reproduce this example. However, for several other examples, it was not possible to get the network converged using a constant for the derivative. Interesting. I just tried this on encoder problems and a couple of other simple things, and leapt to the conclusion that it was a general phenomenon. It seems plausible to me that any "derivative" function that preserves the sign of the error and doesn't have a "flat spot" (stable point of 0 derivative) would work OK, but I don't know of anyone who has made an extensive study of this. I'd be interested in hearing more about the problems you've encountered and about any results others send to you. -- Scott =========================================================================== Scott E. Fahlman Internet: sef+ at cs.cmu.edu Senior Research Scientist Phone: 412 268-2575 School of Computer Science Fax: 412 681-5739 Carnegie Mellon University Latitude: 40:26:33 N 5000 Forbes Avenue Longitude: 79:56:48 W Pittsburgh, PA 15213 ===========================================================================  From marwan at sedal.su.oz.au Sat Feb 6 07:49:53 1993 From: marwan at sedal.su.oz.au (Marwan Jabri) Date: Sat, 6 Feb 1993 23:49:53 +1100 Subject: Does backprop need the derivative ?? Message-ID: <9302061249.AA17234@sedal.sedal.su.OZ.AU> As the intention of the inquirer is the analog implementation of backprop, I see two problems: 1- the question whether the derivative can be replaced by a constant, and more importantly 2- whether the precision of the analog implementation will be high enough for backprop to work. Regarding (1), it is likely as Scott Fahlman suggested any derivative that "preserves" the error sign may do the job. 
The question however is the implication in terms of convergence speed, and the comparison thereof with perturbation type training methods. Regarding (2), there has been several reports indicating that backpropagation simply does not work when the number of bits is reduced towards 6-8 bits! Marwan ------------------------------------------------------------------- Marwan Jabri Email: marwan at sedal.su.oz.au Senior Lecturer Tel: (+61-2) 692-2240 SEDAL, Electrical Engineering, Fax: 660-1228 Sydney University, NSW 2006, Australia Mobile: (+61-18) 259-086  From jlm at crab.psy.cmu.edu Sat Feb 6 08:39:43 1993 From: jlm at crab.psy.cmu.edu (James L. McClelland) Date: Sat, 6 Feb 93 08:39:43 EST Subject: Does backprop need the derivative ?? In-Reply-To: Scott_Fahlman@sef-pmax.slisp.cs.cmu.edu's message of Fri, 05 Feb 93 22:55:28 EST Message-ID: <9302061339.AA19977@crab.psy.cmu.edu.noname> Re the discussion concerning replacing the derivative of the activations of units with a constant: Some work has been done using the activation rather than the derivative of the activation by Nestor Schmajuk. He is interested in biologically plausible models and tends to keep hidden units in the bottom half of the sigmoid. In that case they can be approximated by exponentials and so the derivative can be approximated by the activation. Approx ref: Schmajuk and DiCarlo, Psychological Review, 1992 - Jay McClelland  From ljubomir at darwin.bu.edu Sat Feb 6 11:17:56 1993 From: ljubomir at darwin.bu.edu (Ljubomir Buturovic) Date: Sat, 6 Feb 93 11:17:56 -0500 Subject: Does backprop need the derivative ?? Message-ID: <9302061617.AA13641@darwin.bu.edu> Mr. Heini Withagen says: > I am working on an analog chip implementing a feedforward > network and I am planning to incorporate backpropagation learning > on the chip. If it would be the case that the backpropagation > algorithm doesn't need the derivative, it would simplify the > design enormously. We have trained multilayer perceptron without derivatives, using simplex algorithm for multidimensional optimization (not to be confused with simplex algorithm for linear programming). From our experiments, it turns out that it can be done, however the number of weights is seriously limited, since the memory complexity of simplex is N^2, where N is the total number of variable weights in the network. See reference for further details (the reference is available as a LaTeX file from ljubomir at darwin.bu.edu). Lj. Buturovic, Lj. Citkusev, ``Back Propagation and Forward Propagation,'' in Proc. Int. Joint Conf. Neural Networks, (Baltimore, MD), 1992, pp. IV-486 -- IV-491. Ljubomir Buturovic Boston University BioMolecular Engineering Research Center 36 Cummington Street, 3rd Floor Boston, MA 02215 office: 617-353-7123 home: 617-738-6487  From gary at cs.ucsd.edu Sat Feb 6 11:20:57 1993 From: gary at cs.ucsd.edu (Gary Cottrell) Date: Sat, 6 Feb 93 08:20:57 -0800 Subject: Does backprop need the derivative ?? Message-ID: <9302061620.AA29550@odin.ucsd.edu> I happen to know it doesn't work for a more complicated encoder problem: Image compression. When Paul Munro & I were first doing image compression back in 86, the error would go down and then back up! Rumelhart said: "there's a bug in your code" and indeed there was: we left out the derivative on the hidden units. -g.  From radford at cs.toronto.edu Sun Feb 7 12:24:15 1993 From: radford at cs.toronto.edu (Radford Neal) Date: Sun, 7 Feb 1993 12:24:15 -0500 Subject: Does backprop need the derivative? 
Message-ID: <93Feb7.122429edt.227@neuron.ai.toronto.edu> Other posters have discussed, regarding backprop... > ... the question whether the derivative can be replaced by a constant, To clarify, I believe the intent is that the "constant" have the same sign as the derivative, but have constant magnitude. Marwan Jabri says... > Regarding (1), it is likely as Scott Fahlman suggested any derivative > that "preserves" the error sign may do the job. One would expect this to work only for BATCH training. On-line training approximates the batch result only if the net result of updating the weights on many training cases mimics the summing of derivatives in the batch scheme. This will not be the case if a training case where the derivative is +0.00001 counts as much as one where it is +10000. This is not to say it might not work in some cases. There's just no reason to think that it will work generally. Radford Neal  From Scott_Fahlman at SEF-PMAX.SLISP.CS.CMU.EDU Sun Feb 7 12:56:03 1993 From: Scott_Fahlman at SEF-PMAX.SLISP.CS.CMU.EDU (Scott_Fahlman@SEF-PMAX.SLISP.CS.CMU.EDU) Date: Sun, 07 Feb 93 12:56:03 EST Subject: Does backprop need the derivative ?? In-Reply-To: Your message of Sat, 06 Feb 93 23:49:53 +1100. <9302061249.AA17234@sedal.sedal.su.OZ.AU> Message-ID: As the intention of the inquirer is the analog implementation of backprop, I see two problems: 1- the question whether the derivative can be replaced by a constant, and more importantly 2- whether the precision of the analog implementation will be high enough for backprop to work. ... Regarding (2), there has been several reports indicating that backpropagation simply does not work when the number of bits is reduced towards 6-8 bits! It is true that several studies show a sudden failure of backprop learning when you use fixnum arithmetic and reduce the number of bits per word. The point of failure seems to be problem-specific, but is often around 10-14 bits (including sign). Marcus Hoehfeld and I studied this issue and found that the source of the failure was a quantization effect: the learning algorithm needs to accumulate lots of small steps, for weight-update or whatever, and since these are smaller than half the low-order bit, it ends up accumulating a lot of zeros instead. We showed that if a form of probabilistic rounding (dithering) is used to smooth over these quantization steps, learning continues on down to 4 bits or fewer, with only a gradual degradation in learning time, number of units/weights required, and quality of the result. This study used Cascor, but we believe that the results hold for backprop as well. Marcus Hoehfeld and Scott E. Fahlman (1992) "Learning with Limited Numerical Precision Using the Cascade-Correlation Learning Algorithm" in IEEE Transactions on Neural Networks, Vol. 3, no. 4, July 1992, pp. 602-611. Of course, a learning system implemented in analog hardware might have only a few bits of accuracy due to noise and nonlinearity in the circuits, but it wouldn't suffer from this quantization effect, since you get a sort of probabilistic dithering for free. -- Scott =========================================================================== Scott E.
Fahlman Internet: sef+ at cs.cmu.edu Senior Research Scientist Phone: 412 268-2575 School of Computer Science Fax: 412 681-5739 Carnegie Mellon University Latitude: 40:26:33 N 5000 Forbes Avenue Longitude: 79:56:48 W Pittsburgh, PA 15213 ===========================================================================  From kolen-j at cis.ohio-state.edu Sun Feb 7 11:31:20 1993 From: kolen-j at cis.ohio-state.edu (john kolen) Date: Sun, 7 Feb 93 11:31:20 -0500 Subject: Does backprop need the derivative ?? In-Reply-To: "James L. McClelland"'s message of Sat, 6 Feb 93 08:39:43 EST <9302061339.AA19977@crab.psy.cmu.edu.noname> Message-ID: <9302071631.AA19877@pons.cis.ohio-state.edu> Back prop does not need THE derivative. I have some empirical results which show that most of the internal mathematical operators of back prop can be replaced by qualitatively similar operators. I'm not talking about reducing bit width, as most of the literature does. I was interested in what happens when you replace multiplication with maximum, the sigmoid with a generic bump, etc. What was suprising was that all the tweeks basically worked. Back prop is "functionally" stable in the sense that the learning functional ability remains regardless of minor shifts in internal organization. The reason that the reduced accuracy results are the way that they are can be traced to the loss of continuity rather than the loss of bits. John Kolen  From gary at cs.UCSD.EDU Sun Feb 7 13:09:19 1993 From: gary at cs.UCSD.EDU (Gary Cottrell) Date: Sun, 7 Feb 93 10:09:19 -0800 Subject: Does backprop need the derivative ?? Message-ID: <9302071809.AA00283@odin.ucsd.edu> The sign is always positive. Hence not using it is an approximation that preserves the sign. -g.  From Scott_Fahlman at SEF-PMAX.SLISP.CS.CMU.EDU Sun Feb 7 13:02:42 1993 From: Scott_Fahlman at SEF-PMAX.SLISP.CS.CMU.EDU (Scott_Fahlman@SEF-PMAX.SLISP.CS.CMU.EDU) Date: Sun, 07 Feb 93 13:02:42 EST Subject: Does backprop need the derivative ?? In-Reply-To: Your message of Sat, 06 Feb 93 08:20:57 -0800. <9302061620.AA29550@odin.ucsd.edu> Message-ID: I happen to know it doesn't work for a more complicated encoder problem: Image compression. When Paul Munro & I were first doing image compression back in 86, the error would go down and then back up! Rumelhart said: "there's a bug in your code" and indeed there was: we left out the derivative on the hidden units. -g. I can see why not using the true derivative of the sigmoid, but just an approximation that preserves the sign, might cause learning to bog down, but I don't offhand see how it could cause the error to go up, at least in a net with only one hidden layer and with a monotonic activation function. I wonder if this problem would also occur in a net using the "sigmoid prime offset", which adds a small constant to the derivative of the sigmoid. I haven't seen it. -- Scott =========================================================================== Scott E. Fahlman Internet: sef+ at cs.cmu.edu Senior Research Scientist Phone: 412 268-2575 School of Computer Science Fax: 412 681-5739 Carnegie Mellon University Latitude: 40:26:33 N 5000 Forbes Avenue Longitude: 79:56:48 W Pittsburgh, PA 15213 ===========================================================================  From marwan at sedal.su.oz.au Sun Feb 7 18:13:36 1993 From: marwan at sedal.su.oz.au (Marwan Jabri) Date: Mon, 8 Feb 1993 10:13:36 +1100 Subject: Does backprop need the derivative ?? 
Message-ID: <9302072313.AA24874@sedal.sedal.su.OZ.AU> > It is true that several studies show a sudden failure of backprop learning > when you use fixnum arithmetic and reduce the number of bits per word. The > point of failure seems to be problem-specific, but is often around 10-14 > bits (including sign). > > Marcus Hoehfeld and I studied this issue and found that the source of the > failure was a quantization effect: the learning algorithm needs to > accumulate lots of small steps, for weight-update or whatever, and since > these are smaller than half the low-order bit, it ends up accumulating a > lot of zeros instead. We showed that if a form of probabilistic rounding > (dithering) is used to smooth over these quantization steps, learning > continues on down to 4 bits or fewer, with only a gradual degradation in > learning time, number of units/weights required, and quality of the result. > This study used Cascor, but we believe that the results hold for backprop > as well. > > Marcus Hoehfeld and Scott E. Fahlman (1992) "Learning with Limited > Numerical Precision Using the Cascade-Correlation Learning Algorithm" > in IEEE Transactions on Neural Networks, Vol. 3, no. 4, July 1992, pp. > 602-611. > Yun Xie and I have tried similar experiments on the Sonar and ECG data, and it is fair to say that standard backprop gives up at about 10 bits [2]. In a closer look at the quantisation effects you would find that the signal/noise ratio depends on the number of layers [1]. As you go deeper you require less precision. This would be a source of variation between backprop and cascor. > Of course, a learning system implemented in analog hardware might have only > a few bits of accuracy due to noise and nonlinearity in the circuits, but > it wouldn't suffer from this quantization effect, since you get a sort of > probabilistic dithering for free. > Hmmm... precision also suffers from the number of operations in analog implementations. The free dithering you get is everywhere, including in your errors! The gradient descent turns into a yoyo. This is well explained in [2, 3]. The best way of using backprop or, more efficiently, conjugate gradient is to do the training off-chip and then to download the (truncated) weights. Our experience in the training of real analog chips shows that some further in-loop training is required. Note our chips were ultra low power and you may have fewer problems with strong inversion implementations. Regarding the idea of Simplex that has been suggested: the inquirer was talking about on-chip learning. Have you in your experiments done a limited precision Simplex? Have you tried it on a chip in in-loop mode? Philip Leong here has tried a similar idea (I think) a while back. The problem with this approach is that you need to have a very good guess at your starting point as the Simplex will move you from one vertex (feasible solution) to another while expanding the weight solution space. Philip's experience is that it does work for small problems when you have a good guess! At the last NIPS, there were 4 posters about learning in or for analog chips. The inquirer may wish to consult these papers (two at least were advertised as deposited in the neuroprose archive, one by Gert Cauwenberghs and one by Barry Flower and I). So far, for us, the most reliable analog chip training algorithm has been the combined search algorithm (modified weight perturbation and partial random search) [3]. I will be very interested in hearing more about experiments where analog chips are trained.
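As a concrete illustration of the two alternatives discussed in this thread -- replacing the logistic derivative inside backprop by a sign-preserving constant, and estimating gradients by weight perturbation so that no derivatives are needed at all -- here is a minimal Python sketch. The toy two-layer network, the learning rates, the constant 0.25 and the perturbation size are arbitrary choices for illustration; in particular, the second function is plain weight perturbation, not the combined search algorithm of [3].

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_step(W1, W2, x, t, lr=0.5, constant_deriv=False):
    """One on-line backprop step for a 2-layer sigmoid net (squared error).
    If constant_deriv is True, the true logistic derivative y*(1-y) is
    replaced by the sign-preserving constant 0.25 (its maximum value)."""
    xb = np.append(x, 1.0)                  # input plus bias
    h  = sigmoid(xb @ W1)
    hb = np.append(h, 1.0)
    y  = sigmoid(hb @ W2)
    dy = 0.25 if constant_deriv else y * (1.0 - y)
    dh = 0.25 if constant_deriv else h * (1.0 - h)
    delta_out = (t - y) * dy
    delta_hid = (W2[:-1] @ delta_out) * dh
    W2 += lr * np.outer(hb, delta_out)
    W1 += lr * np.outer(xb, delta_hid)
    return W1, W2

def weight_perturbation_step(forward, w, X, T, delta=0.01, lr=0.2):
    """Derivative-free alternative: perturb each weight in turn, measure the
    resulting change in error through the (possibly analog, in-loop) forward
    pass, and update using the finite-difference estimate of the gradient."""
    def err(wv):
        return float(np.mean((forward(wv, X) - T) ** 2))
    base = err(w)
    grad = np.zeros_like(w)
    for i in range(w.size):
        wp = w.copy()
        wp.flat[i] += delta                 # perturb one weight at a time
        grad.flat[i] = (err(wp) - base) / delta
    return w - lr * grad, base

# toy usage: a 2-2-1 net on XOR with the constant-derivative variant;
# whether this converges as well as the exact-derivative version is
# exactly the question raised in this thread.
rng = np.random.default_rng(0)
W1 = rng.uniform(-1, 1, (3, 2))
W2 = rng.uniform(-1, 1, (3, 1))
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)
for _ in range(5000):
    for x, t in zip(X, T):
        W1, W2 = backprop_step(W1, W2, x, t, constant_deriv=True)

In the weight-perturbation case the forward pass is treated as a black box, which is why this family of methods is attractive for in-loop training of limited-precision analog hardware, where analytic derivatives and very small weight updates are unreliable.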
Marwan [1] Yun Xie and M. Jabri, Analysis of the Effects of Quantization in Multi-layer Neural Networks Using A Statistical Model, IEEE Transactions on Neural Networks, Vol. 3, No. 2, pp. 334-338, March, 1992. [2] M. Jabri, S. Pickard, P. Leong and Y. Xie, Algorithms and Implementation Issues in Analog Low Power Learning Neural Network Chips, To appear in the International Journal on VLSI Signal Processing, early 1993, USA. [3] Y. Xie and M. Jabri, On the Training of Limited Precision Multi-layer Perceptrons. Proceedings of the International Joint Conference on Neural Networks, pp III-942-947, July 1992, Baltimore, USA. ------------------------------------------------------------------- Marwan Jabri Email: marwan at sedal.su.oz.au Senior Lecturer Tel: (+61-2) 692-2240 SEDAL, Electrical Engineering, Fax: 660-1228 Sydney University, NSW 2006, Australia Mobile: (+61-18) 259-086  From takagi at diva.berkeley.edu Sun Feb 7 14:36:59 1993 From: takagi at diva.berkeley.edu (Hideyuki Takagi) Date: Sun, 7 Feb 93 11:36:59 -0800 Subject: attendance restriction at BISC Special Seminar Message-ID: <9302071936.AA00803@diva.Berkeley.EDU> ORGANIZATIONAL CHANGE in Extended BISC Special Seminar 10:30AM-5:45PM, March 28 (Sunday), 1993 Sibley Auditorium (210) in Bechtel Hall University of California, Berkeley CA 94720 Dear Colleagues: This is to inform you of an organizational change in the Extended BISC Special Seminar which was announced on February 4. Most of the speakers in the regular BISC Seminar are associated with companies and universities in the Bay area. The motivation for the Extended BISC Seminar was to take advantage of the presence in the Bay area of some of the leading contributors to fuzzy logic and neural network theory from abroad, who will be participating in FUZZ-IEEE'93 / ICNN'93. A problem which became apparent is that because both the Extended BISC Seminar and the FUZZ-IEEE'93/ICNN'93 tutorials are scheduled to take place on the same day, the BISC Seminar may have an adverse effect on registration for the conference tutorials. To resolve this problem, it was felt that it may be necessary to restrict attendance at the Extended BISC Seminar to students and faculty in the Bay area who normally attend the BISC Seminar. In this way, the Extended BISC Seminar would serve its usual role and at the same time bring to the Berkeley Campus some of the leading contributors to soft computing. The publicity for the Extended BISC Seminar will state that attendance is limited to students and faculty in the Bay area. Sincerely, BISC (Berkeley Initiative for Soft Computing) ---------------------------------------------  From mav at cs.uq.oz.au Sun Feb 7 19:33:21 1993 From: mav at cs.uq.oz.au (Simon Dennis) Date: Mon, 08 Feb 93 10:33:21 +1000 Subject: Learning in Memory Technical Report Message-ID: <9302080033.AA10081@uqcspe.cs.uq.oz.au> The following technical report is available for anonymous ftp. TITLE: Integrating Learning into Models of Human Memory: The Hebbian Recurrent Network AUTHORS: Simon Dennis and Janet Wiles ABSTRACT: We develop an interactive model of human memory called the Hebbian Recurrent Network (HRN) which integrates work in the mathematical modeling of memory with that in error correcting connectionist networks. It incorporates the matrix model (Pike, 1984; Humphreys, Bain & Pike, 1989) into the Simple Recurrent Network (SRN, Elman, 1989).
The result is an architecture which has the desirable memory characteristics of the matrix model such as low interference and massive generalization but which is able to learn appropriate encodings for items, decision criteria and the control functions of memory which have traditionally been chosen a priori in the mathematical memory literature. Simulations demonstrate that the HRN is well suited to a recognition task inspired by typical memory paradigms. When compared against the SRN the HRN is able to learn longer lists, generalizes from smaller training sets, and is not degraded significantly by increasing the vocabulary size. Please mail correspondence to mav at cs.uq.oz.au Ftp Instructions: $ ftp exstream.cs.uq.oz.au Connected to exstream.cs.uq.oz.au. 220 exstream FTP server (Version 6.12 Fri May 8 16:33:17 EST 1992) ready. Name (exstream.cs.uq.oz.au:mav): anonymous 331 Guest login ok, send e-mail address as password. Password: 230- Welcome to ftp.cs.uq.oz.au 230-This is the University of Queensland Computer Science Anonymous FTP server. 230-For people outside of the department, please restrict your usage to outside 230-of the hours 8am to 6pm. 230- 230-The local time is Mon Feb 8 10:26:05 1993 230- 230 Guest login ok, access restrictions apply. ftp> cd pub/TECHREPORTS/department 250 CWD command successful. ftp> bin 200 Type set to I. ftp> get TR0252.ps.Z 200 PORT command successful. 150 Opening BINARY mode data connection for TR0252.ps.Z (160706 bytes). 226 Transfer complete. local: TR0252.ps.Z remote: TR0252.ps.Z 160706 bytes received in 0.71 seconds (2.2e+02 Kbytes/s) ftp> quit 221 Goodbye. $ Printing Instructions: $ zcat TR0252.ps.Z | lpr  From efiesler at idiap.ch Mon Feb 8 03:22:31 1993 From: efiesler at idiap.ch (E. Fiesler) Date: Mon, 8 Feb 93 09:22:31 +0100 Subject: Does backprop need the derivative ?? Message-ID: <9302080822.AA22484@idiap.ch> Marwan Jabri wrote: > Date: Sat, 6 Feb 1993 23:49:53 +1100 > From: Marwan Jabri > Subject: Re: Does backprop need the derivative ?? > > As the intention of the inquirer is the analog implementation of > backprop, I see two problems: 1- the question whether the derivative can > be replaced by a constant, and more importantly 2- whether the precision > of the analog implementation will be high enough for backprop to work. > > Regarding (1), ... > > Regarding (2), there has been several reports indicating that > backpropagation simply does not work when the number of bits is reduced > towards 6-8 bits! This is often reported for standard backpropagation. However, a simple extension of backpropagation can make it work for any precision; up to 1-2 bits. I'll append the reference(s) below. E. Fiesler Directeur de Recherche IDIAP Case postale 609 CH-1920 Martigny Switzerland @InProceedings{Fiesler-90, Author = "E. Fiesler and A. Choudry and H. J. Caulfield", Title = "A Weight Discretization Paradigm for Optical Neural Networks", BookTitle = "Proceedings of the International Congress on Optical Science and Engineering", Volume = "SPIE-1281", Pages = "164--173", Publisher = "The International Society for Optical Engineering Proceedings", Address = "Bellingham, Washington, U.S.A.", Year = "1990", ISBN = "0-8194-0328-8", Language = "English" } @Article{Fiesler-93, Author = "E. Fiesler and A. Choudry and H. J. 
Caulfield", Title = "A Universal Weight Discretization Method for Multi-Layer Neural Networks", Journal = "IEEE Transactions on Systems, Man, and Cybernetics (IEEE-SMC)", Publisher = "The Institute of Electrical and Electronics Engineers (IEEE), Inc.", Address = "New York, New York", Year = "1993", ISSN = "0018-9472", Language = "English", Note = "Accepted for publication." }  From annette at cdu.ucl.ac.uk Mon Feb 8 05:13:06 1993 From: annette at cdu.ucl.ac.uk (Annette Karmiloff-Smith) Date: Mon, 8 Feb 93 10:13:06 GMT Subject: Cognitive Development for Connectionists Message-ID: <9302081013.AA14475@cdu.ucl.ac.uk> Below are details of two articles and a book which may be of interest to connectionists: A.Karmiloff-Smith (1992), Connection Science, Vo.4, Nos. 3 & 4, 253- 269. NATURE, NURTURE ANDS PDP: Preposterous Developmental Postulates? (N.B. the question mark - I end on: Promising Developmental Postulates!) Abstract: In this article I discuss the nature/nurture debate in terms of evidence and theorizing from the field of cognitive development, and pinpoint various problems where the Connectionist framework needs to be further explored from this perspective. Evidence from normal and abnormal developmental phenotypes points to some domain-specific constraints on early learning. Yet, by invoking the dynamics of epigenesis, I avoid recourse to a strong Nativist stance and remain within the general spirit of Connectionism. _____________________________________________________________ A. Karmiloff-Smith (1992) Technical Report TR.PDP.CNS.92.7, Carnegie Mellon University, Pittsburgh. ABNORMAL PHENOTYPES AND THE CHALLENGES THEY POSE TO CONNECTIONIST MODELS OF DEVELOPMENT Abstract: The comparison of different abnormal phenotypes (e.g. Williams syndrome, Down syndrome, autism, hydrocephalus with associated myelomeningocele) raises a number of questions about domain-general versus domain-specific processes and suggests that development stems from domain-specific predispositions which channel infantsU attention to proprietary inputs. This is not to be confused with a strong Nativist position. Genetically fully specified modules are not the starting point of development. Rather, a process of gradual modularization builds on skeletal domain-specific predispositions (architectural and/or representational) which give the normal infant a small but significant head-start. It is argued that Down syndrome infants may lack these head-starts, whereas individuals with Williams syndrome, autism and hydrocephalus with associated myelomeningocele have a head-start in selected domains only, leading to different cognitive profiles despite equivalent input. Stress is placed on the importance of exploring a developing system, rather than a lesioned adult system. The position developed in the paper not only contrasts with the strong Nativist stance, but also with the view that domain-general processes are simply applied to whatever inputs the child encounters. The comparison of different phenotypical outcomes is shown to pose interesting challenges to connectionist simulations of development. ______________________________________________________________ A.Karmiloff-Smith (1992) BEYOND MODULARITY: A DEVELOPMENTAL PERSPECTIVE ON COGNITIVE SCIENCE. MIT Press/Bradford Books. A book intended to excite connectionists and other non- developmentalists about the essential role that a developmental perspective has in understanding the special nature of human cognition compared to other species. Contents: 1. 
Taking development seriously 2. The child as a linguist 3. The child as a physicist 4. The child as a mathematician 5. The child as a psychologist 6. The child as a notator 7. Nativism, domain specificity and PiagetUs constructivism 8. Modelling development: representational redescription and connectionism 9. Concluding speculations Reprints of articles obtainable from: Annette Karmiloff-Smith Medical Research Council Cognitive Development Unit London WC1H 0AH. U.K.  From SCHOLTES at ALF.LET.UVA.NL Mon Feb 8 06:19:00 1993 From: SCHOLTES at ALF.LET.UVA.NL (SCHOLTES@ALF.LET.UVA.NL) Date: 08 Feb 1993 12:19 +0100 (MET) Subject: PhD Dissertation Available Message-ID: <346B17ED606070C5@VAX1.SARA.NL> =================================================================== Ph.D. DISSERTATION AVAILABLE on Neural Networks, Natural Language Processing, Information Retrieval 292 pages and over 350 references =================================================================== A Copy of the dissertation "Neural Networks in Natural Language Processing and Information Retrieval" by Johannes C. Scholtes can be obtained for cost price and fast airmail- delivery at US$ 25,-. Payment by Major Creditcards (VISA, AMEX, MC, Diners) is accepted and encouraged. Please include Name on Card, Number and Exp. Date. Your Credit card will be charged for Dfl. 47,50. Within Europe one can also send a Euro-Cheque for Dfl. 47,50 to: University of Amsterdam J.C. Scholtes Dufaystraat 1 1075 GR Amsterdam The Netherlands Do not forget to mention a surface shipping address. Please allow 2-4 weeks for delivery. Abstract 1.0 Machine Intelligence For over fifty years the two main directions in machine intelligence (MI), neural networks (NN) and artificial intelligence (AI), have been studied by various persons with many different backgrounds. NN and AI seemed to conflict with many of the traditional sciences as well as with each other. The lack of a long research history and well defined foundations has always been an obstacle for the general acceptance of machine intelligence by other fields. At the same time, traditional schools of science such as mathematics and physics developed their own tradition of new or "intelligent" algorithms. Progress made in the field of statistical reestimation techniques such as the Hidden Markov Models (HMM) started a new phase in speech recognition. Another application of the progress of mathematics can be found in the application of the Kalman filter in the interpretation of sonar and radar signals. Much more examples of such "intelligent" algorithms can be found in the statistical classification en filtering techniques of the study of pattern recognition (PR). Here, the field of neural networks is studied with that of pattern recognition in mind. Although only global qualitative comparisons are made, the importance of the relation between them is not to be underestimated. In addition it is argued that neural networks do indeed add something to the fields of MI and PR, instead of competing or conflicting with them. 2.0 Natural Language Processing The study of natural language processing (NLP) exists even longer than that of MI. Already in the beginning of this century people tried to analyse human language with machines. However, serious efforts had to wait until the development of the digital computer in the 1940s, and even then, the possibilities were limited. For over 40 years, symbolic AI has been the most important approach in the study of NLP. 
That this has not always been the case, may be concluded from the early work on NLP by Harris. As a matter of fact, Chomsky's Syntactic Structures was an attack on the lack of structural properties in the mathematical methods used in those days. But, as the latter's work remained the standard in NLP, the former has been forgotten completely until recently. As the scientific community in NLP devoted all its attention to the symbolic AI-like theories, the only useful practical implementation of NLP systems were those that were based on statistics rather than on linguistics. As a result, more and more scientists are redirecting their attention towards the statistical techniques available in NLP. The field of connectionist NLP can be considered as a special case of these mathematical methods in NLP. More than one reason can be given to explain this turn in approach. On the one hand, many problems in NLP have never been addressed properly by symbolic AI. Some examples are robust behavior in noisy environments, disambiguation driven by different kinds of knowledge, commensense generalizations, and learning (or training) abilities. On the other hand, mathematical methods have become much stronger and more sensitive to specific properties of language such as hierarchical structures. Last but not least, the relatively high degree of success of mathematical techniques in commercial NLP systems might have set the trend towards the implementation of simple, but straightforward algorithms. In this study, the implementation of hierarchical structures and semantical features in mathematical objects such as vectors and matrices is given much attention. These vectors can then be used in models such as neural networks, but also in sequential statistical procedures implementing similar characteristics. 3.0 Information Retrieval The study of information retrieval (IR) was traditionally related to libraries on the one hand and military applications on the other. However, as PC's grew more popular, most common users loose track of the data they produced over the last couple of years. This, together with the introduction of various "small platform" computer programs made the field of IR relevant to ordinary users. However, most of these systems still use techniques that have been developed over thirty years ago and that implement nothing more than a global surface analysis of the textual (layout) properties. No deep structure whatsoever, is incorporated in the decision whether or not to retrieve a text. There is one large dilemma in IR research. On the one hand, the data collections are so incredibly large, that any method other than a global surface analysis would fail. On the other hand, such a global analysis could never implement a contextually sensitive method to restrict the number of possible candidates returned by the retrieval system. As a result, all methods that use some linguistic knowledge exist only in laboratories and not in the real world. Conversely, all methods that are used in the real world are based on technological achievements from twenty to thirty years ago. Therefore, the field of information retrieval would be greatly indebted to a method that could incorporate more context without slowing down. As computers are only capable of processing numbers within reasonable time limits, such a method should be based on vectors of numbers rather than on symbol manipulations. This is exactly where the challenge is: on the one hand keep up the speed, and on the other hand incorporate more context. 
If possible, the data representation of the contextual information must not be restricted to a single type of media. It should be possible to incorporate symbolic language as well as sound, pictures and video concurrently in the retrieval phase, although one does not know exactly how yet... Here, the emphasis is more on real-time filtering of large amounts of dynamic data than on document retrieval from large (static) data bases. By incorporating more contextual information, it should be possible to implement a model that can process large amounts of unstructured text without providing the end-user with an overkill of information. 4.0 The Combination As this study is a very multi-disciplinary one, the risk exists that it remains restricted to a surface discussion of many different problems without analyzing one in depth. To avoid this, some central themes, applications and tools are chosen. The themes in this work are self-organization, distributed data representations and context. The applications are NLP and IR, the tools are (variants of) Kohonen feature maps, a well known model from neural network research. Self-organization and context are more related to each other than one may suspect. First, without the proper natural context, self-organization shall not be possible. Next, self-organization enables one to discover contextual relations that were not known before. Distributed data representation may solve many of the unsolved problems in NLP and IR by introducing a powerful and efficient knowledge integration and generalization tool. However, distributed data representation and self-organization trigger new problems that should be solved in an elegant manner. Both NLP and IR work on symbolic language. Both have properties in common but both focus on different features of language. In NLP hierarchical structures and semantical features are important. In IR the amount of data sets the limitations of the methods used. However, as computers grow more powerful and the data sets get larger and larger, both approaches get more and more common ground. By using the same models on both applications, a better understanding of both may be obtained. Both neural networks and statistics would be able to implement self-organization, distrib- uted data and context in the same manner. In this thesis, the emphasis is on Kohonen feature maps rather than on statistics. However, it may be possible to implement many of the techniques used with regular sequential mathematical algorithms. So, the true aim of this work can be formulated as the understanding of self-organization, distributed data representation, and context in NLP and IR, by in depth analysis of Kohonen feature maps. ==============================================================================  From george at psychmips.york.ac.uk Mon Feb 8 08:32:38 1993 From: george at psychmips.york.ac.uk (George Bolt) Date: Mon, 8 Feb 93 13:32:38 +0000 (GMT) Subject: Does backprop need the derivative ?? Message-ID: Heini Withagen wrote: In his paper, 'An Empirical Study of Learning Speed in Back-Propagation Networks', Scott E. Fahlmann shows that with the encoder/decoder problem it is possible to replace the derivative of the transfer function by a constant. I have been able to reproduce this example. However, for several other examples, it was not possible to get the network converged using a constant for the derivative. - end quote - I've looked at BP learning in MLP's w.r.t. 
fault tolerance and found that the derivative of the transfer function is used to *stop* learning. Once a unit's weights for some particular input (to that unit rather than the network) are sufficiently developed for it to decide whether to output 0 or 1, then weight changes are approximately zero due to this derivative. I would imagine that by setting it to a constant, then a MLP will over- learn certain patterns and be unable to converge to a state of equilibrium, i.e. all patterns are matched to some degree. A better route would be to set the derivative function to a constant over a range [-r,+r], where f[r] - (sorry) f( |r| ) -> 1.0. To make individual units robust with respect to weights, make r=c.a where f( |a| ) -> 1.0 and c is a small constant multiplicative value. - George Bolt University of York, U.K.  From movellan at cogsci.UCSD.EDU Mon Feb 8 20:33:19 1993 From: movellan at cogsci.UCSD.EDU (Javier Movellan) Date: Mon, 8 Feb 93 17:33:19 PST Subject: Does backprop need the derivative ?? In-Reply-To: Marwan Jabri's message of Sat, 6 Feb 1993 23:49:53 +1100 <9302061249.AA17234@sedal.sedal.su.OZ.AU> Message-ID: <9302090133.AA16068@cogsci.UCSD.EDU> My experience with Boltzmann machines and GRAIN/diffusion networks (the continuous stochastic version of the Boltzmann machine) has been that replacing the real gradient by its sign times a constant accelerates learning DRAMATICALLY. I first saw this technique in one of the original CMU tech reports on the Boltzmann machine. I believe Peterson and Hartman and Peterson and Anderson also used this technique, which they called "Manhattan updating", with the deterministic Mean Field learning algorithm. I believe they had an article in "Complex Systems" comparing Backprop and Mean-Field with both with standard gradient descent and with Manhattan updating. It is my understanding that the Mean-Field/Boltzmann chip developed at Bellcore uses "Manhattan Updating" as its default training method. Josh Allspector is the person to contact about this. At this point I've tried 4 different learning algorithms with continuous and discrete stochastic networks and in all cases Manhattan Updating worked better than straight gradient descent.The question is why Manhattan updating works so well (at least in stochastic and Mean-Field networks) ? One possible interpreation is that Manhattan updating limits the influence of outliers and thus it performs something similar to robust regression. Another interpretation is that Manhattan updating avoids the saturation regions, where the error space becomes almost flat in some dimensions, slowing down learning. One of the disadvantages of Manhattan updating is that sometimes one needs to reduce the weight change constant at the end of learning. But sometimes we also do this in standard gradient descent anyway. -Javier  From oby%firenze%venezia.ROCKEFELLER.EDU at ROCKVAX.ROCKEFELLER.EDU Mon Feb 8 20:42:08 1993 From: oby%firenze%venezia.ROCKEFELLER.EDU at ROCKVAX.ROCKEFELLER.EDU (Klaus Obermayer) Date: Mon, 8 Feb 93 20:42:08 -0500 Subject: No subject Message-ID: <9302090142.AA01612@firenze> The following article is available as a (hardcopy) preprint: Obermayer K. and Blasdel G.G. (1993), Geometry of Orientation and Ocular Dominance Columns in Monkey Striate Cortex, J. Neurosci., in press. 
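For concreteness, a minimal sketch (in Python) of the sign-based "Manhattan updating" rule discussed by Javier Movellan earlier in this digest: the exact gradient is computed as usual, but only the sign of each component is kept and a fixed-size step is taken, with the step slowly reduced late in training. The toy network, data and step sizes are illustrative assumptions, not code from any of the papers or chips mentioned.

import numpy as np

def manhattan_update(w, grad, step):
    # Keep only the sign of each gradient component and take a fixed-size step;
    # components with exactly zero gradient are left unchanged.
    return w - step * np.sign(grad)

# Illustrative use: a single logistic unit trained on made-up data with squared error.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                                     # assumed toy inputs
t = (X @ np.array([1.0, -2.0, 0.5, 0.0, 1.0]) > 0).astype(float)  # assumed toy targets
w = np.zeros(5)
step = 0.05                                                       # illustrative fixed step
for epoch in range(200):
    y = 1.0 / (1.0 + np.exp(-(X @ w)))              # logistic output
    grad = X.T @ ((y - t) * y * (1.0 - y))          # exact batch gradient of 0.5*sum((y-t)^2)
    w = manhattan_update(w, grad, step)
    step *= 0.99                                    # shrink the step near the end of training

Because every weight moves by the same amount, outliers and large, unbalanced gradient components no longer dictate the effective learning rate, which is one of the interpretations offered in the discussion.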
Abstract: In addition to showing that ocular dominance is organized in slabs and that orientation preferences are organized in linear sequences likely to reflect slabs, Hubel and Wiesel (1974) discussed the intriguing possibility that slabs of orientation might intersect slabs of ocular dominance at some consistent angle. Advances in optical imaging now make it possible to test this possibility directly. When maps of orientation are analyzed quantitatively, they appear to arise from a combination of at least two competing themes: one where orientation preferences change linearly along straight axes, remaining constant along perpendicular axes and forming iso-orientation slabs along the way, and one where orientation preferences change continuously along circular axes, remaining constant along radial axes and forming singularities at the centers of the spaces enclosed. When orientation patterns are compared with ocular dominance patterns from the same cortical regions, quantitative measures reveal: 1) that singularities tend to lie at the centers of ocular dominance columns, 2) that linear zones (arising where orientation preferences change along straight axes) tend to lie at the edges of ocular dominance columns, and 3) that the short iso-orientation bands within each linear zone tend to intersect the borders of ocular dominance slabs at angles of approximately 90$^o$. ----------------------------------------------------------------- The original article contains color figures which - for cost reasons - have to be reproduced black and white. If you would like to obtain a copy, please send your surface mail address to: Klaus Obermayer The Rockefeller University oby at rockvax.rockefeller.edu -----------------------------------------------------------------  From thgoh at iss.nus.sg Tue Feb 9 01:05:52 1993 From: thgoh at iss.nus.sg (Goh Tiong Hwee) Date: Tue, 9 Feb 1993 14:05:52 +0800 (WST) Subject: Does Backprop need the derivative Message-ID: <9302090605.AA08961@iss.nus.sg> From fellous%hyla.usc.edu at usc.edu Wed Feb 10 21:48:50 1993 From: fellous%hyla.usc.edu at usc.edu (Jean-Marc Fellous) Date: Wed, 10 Feb 93 18:48:50 PST Subject: CNE / USC Workshop Reminder and Update. Message-ID: <9302110248.AA01295@hyla.usc.edu> Thank you for posting the following final announcement: *********************** Last Reminder and Update ************************ SCHEMAS AND NEURAL NETWORKS INTEGRATING SYMBOLIC AND SUBSYMBOLIC APPROACHES TO COOPERATIVE COMPUTATION A Workshop sponsored by the Center for Neural Engineering University of Southern California Los Angeles, CA 90089-2520 April 13th and 14th, 1993 Program Committee: Michael Arbib (Organizer), John Barnden, George Bekey, Francisco Cervantes-Perez, Damian Lyons, Paul Rosenbloom, Ron Sun, Akinori Yonezawa A previous announcement (reproduced below) announced a registra- tion fee of $150 and advertised the availability of hotel accom- modation at $70/night. To encourage the participation of qualified students we have made 3 changes: 1) We have appointed Jean-Marc Fellous as Student Chair for the meeting to coordinate the active involvement of such students. 2) We offer a Student Registration Fee of only $40 to students whose application is accompanied by a letter from their supervi- sor attesting to their student status. 3) Mr. 
Fellous has identified a number of lower-cost housing op- tions, and will respond to queries to fellous at rana.usc.edu The original announcement - with updated registration form - fol- lows: To design complex technological systems and to analyze complex biological and cognitive systems, we need a multilevel methodolo- gy which combines a coarse-grain analysis of cooperative or dis- tributed computation (we shall refer to the computing agents at this level as "schemas") with a fine-grain model of flexible, adaptive computation (for which neural networks provide a power- ful general paradigm). Schemas provide a language for distri- buted artificial intelligence, perceptual robotics, cognitive modeling, and brain theory which is "in the style of the brain", but at a relatively high level of abstraction relative to neural networks. The proposed workshop will provide a 2-hour introductory tutorial and problem statement by Michael Arbib, and sessions in which an invited paper will be followed by several contributed papers, selected from those submitted in response to this call for pa- pers. Preference will be given to papers which present practical examples of, theory of, and/or methodology for the design and analysis of complex systems in which the overall specification or analysis is conducted in terms of schemas, and where some but not necessarily all of the schemas are implemented in neural net- works. A list of sample topics for contributions is as follows, where a hybrid approach means one in which the abstract schema level is integrated with neural or other lower level models: Schema Theory as a description language for neural networks Modular neural networks Linking DAI to Neural Networks to Hybrid Architecture Formal Theories of Schemas Hybrid approaches to integrating planning & reaction Hybrid approaches to learning Hybrid approaches to commonsense reasoning by integrating neural networks and rule- based reasoning (using schema for the integration) Programming Languages for Schemas and Neural Networks Concurrent Object-Oriented Programming for Distributed AI and Neural Networks Schema Theory Applied in Cognitive Psychology, Linguistics, Robotics, AI and Neuroscience Prospective contributors should send a hard copy of a five-page extended abstract, including figures with informative captions and full references (either by regular mail or fax) by February 15, 1993 to: Michael Arbib, Center for Neural Engineering University of Southern California Los Angeles, CA 90089-2520 USA Tel: (213) 740-9220 Fax: (213) 746-2863 arbib at pollux.usc.edu] Please include your full address, including fax and email, on the paper. Notification of acceptance or rejection will be sent by email no later than March 1, 1993. There are currently no plans to issue a formal proceedings of full papers, but revised versions of ac- cepted abstracts received prior to April 1, 1993 will be collect- ed with the full text of the Tutorial in a CNE Technical Report which will be made available to registrants at the start of the meeting. [A useful way to structure such an abstract is in short numbered sections, where each section presents (in a small type face!) the material corresponding to one transparency/slide in a verbal presentation. This will make it easy for an audi- ence to take notes if they have a copy of the abstract at your presentation.] 
Hotel Information: Attendees may register at the hotel of their choice, but the closest hotel to USC is the University Hilton, 3540 South Figueroa Street, Los Angeles, CA 90007, Phone: (213) 748- 4141, Reservation: (800) 872-1104, Fax: (213) 748- 0043. A single room costs $70/night while a double room costs $75/night. Workshop participants must specify that they are "Schemas and Neural Networks Workshop" attendees to avail of the above rates. Information on student accommodation may be ob- tained from the Student Chair, Jean-Marc Fellous, fellous at rana.usc.edu. The registration fee of $150 ($40 for qualified students who in- clude a "certificate of student status" from their advisor) in- cludes a copy of the abstracts, coffee breaks, and a dinner to be held on the evening of April 13th. Those wishing to register should send a check payable to "Center for Neural Engineering, USC" for $150 ($40 for students) together with the following information to: Paulina Tagle Center for Neural Engineering University of Southern California University Park Los Angeles, CA 90089-2520 USA ---------------------------------------------------------- SCHEMAS AND NEURAL NETWORKS Center for Neural Engineering USC April 13 - 14, 1993 NAME: ___________________________________________ ADDRESS: _________________________________________ PHONE NO.: _______________ FAX:___________________ EMAIL: ___________________________________________ I intend to submit a paper: YES [ ] NO [ ]  From ljubomir at darwin.bu.edu Wed Feb 10 21:12:30 1993 From: ljubomir at darwin.bu.edu (Ljubomir Buturovic) Date: Wed, 10 Feb 93 21:12:30 -0500 Subject: Does backprop need the derivative? Message-ID: <9302110212.AA07255@darwin.bu.edu> Marwan Jabri: > Regarding the idea of Simplex that has been suggested. The inquirer was > talking about on-chip learning. Have you in your experiments done a > limited precision Simplex? Have you tried it on a chip in in-loop mode? > Philip Leong here has tried a similar idea (I think) a while back. The > problem with this approach is that you need to a have a very good guess at > your starting point as the Simplex will move you from one vertex (feasible > solution) to another while expanding the weight solution space. > Philip's experience is that it does work for small problems when you have > a good guess! No, we did not try limited precision Simplex, since the method has another serious limitation, which is memory complexity. So there is no point performing such refined studies until this problem is resolved, let alone on-chip implementation. The biggest problem we tried it on succesfully was 11-dimensional (i. e., input samples were 11-dimensional vectors). The initial guess was pseudo-random, like in back-propagation. In another, 12-dimensional example, it did not do well (neither did back-prop, but Simplex was much worse), so it might be true that it needs a good starting point. Ljubomir Buturovic Boston University BioMolecular Engineering Research Center 36 Cummington Street, 3rd Floor Boston, MA 02215 office: 617-353-7123 home: 617-738-6487  From mozer at dendrite.cs.colorado.edu Thu Feb 11 23:47:27 1993 From: mozer at dendrite.cs.colorado.edu (Michael C. 
Mozer) Date: Thu, 11 Feb 1993 21:47:27 -0700
Subject: Preprint: Neural net architectures for temporal sequence processing
Message-ID: <199302120447.AA06812@neuron.cs.colorado.edu>

-.--.---.----.-----.------.-------.--------.-------.------.-----.----.---.--.-
PLEASE DO NOT POST TO OTHER BOARDS
-.--.---.----.-----.------.-------.--------.-------.------.-----.----.---.--.-

Neural net architectures for temporal sequence processing

Michael C. Mozer
Department of Computer Science
University of Colorado

I present a general taxonomy of neural net architectures for processing time-varying patterns. This taxonomy subsumes many existing architectures in the literature, and points to several promising architectures that have yet to be examined. Any architecture that processes time-varying patterns requires two conceptually distinct components: a short-term memory that holds on to relevant past events and an associator that uses the short-term memory to classify or predict. The taxonomy is based on a characterization of short-term memory models along the dimensions of form, content, and adaptability. Experiments on predicting future values of a financial time series (US dollar-Swiss franc exchange rates) are presented using several alternative memory models. The results of these experiments serve as a baseline against which more sophisticated architectures can be compared.

To appear in: A. S. Weigend & N. A. Gershenfeld (Eds.), _Predicting the future and understanding the past_. Redwood City, CA: Addison-Wesley. Spring 1993.

-.--.---.----.-----.------.-------.--------.-------.------.-----.----.---.--.-

To retrieve:
unix> ftp archive.cis.ohio-state.edu
Name: anonymous
230 Guest login ok, access restrictions apply.
ftp> cd pub/neuroprose
ftp> binary
ftp> get mozer.architectures.ps.Z
200 PORT command successful.
ftp> quit
unix> zcat mozer.architectures.ps.Z | lpr

Warning: May not print on wimpy laser printers.

From kolen-j at cis.ohio-state.edu Tue Feb 9 07:51:53 1993
From: kolen-j at cis.ohio-state.edu (john kolen)
Date: Tue, 9 Feb 93 07:51:53 -0500
Subject: Does backprop need the derivative?
In-Reply-To: Radford Neal's message of Sun, 7 Feb 1993 12:24:15 -0500 <93Feb7.122429edt.227@neuron.ai.toronto.edu>
Message-ID: <9302091251.AA27813@pons.cis.ohio-state.edu>

The sign of the derivative is always positive (remember o(1-o) and 0 < o < 1).

>Other posters have discussed, regarding backprop...
>
>> ... the question whether the derivative can be replaced by a constant,
>
>To clarify, I believe the intent is that the "constant" have the same
>sign as the derivative, but have constant magnitude.

I haven't been following this thread, but the following reference may be helpful to those that are. Blum (Annals of Math. Statistics vol. 25 1954 p.385) shows that if the "constant magnitude" is going to zero (so that the system is convergent) the convergence is not to a minimum of the expected error (this is usually what we want backprop to do), but to a minimum of the *median* of the error.

Chris Darken darken at learning.scr.siemens.com

From munro at lis.pitt.edu Thu Feb 11 11:14:44 1993
From: munro at lis.pitt.edu (fac paul munro)
Date: Thu, 11 Feb 93 11:14:44 EST
Subject: Summary of "Does backprop need the derivative ??"
In-Reply-To: Mail from 'Heini Withagen ' dated: Tue, 9 Feb 1993 11:46:06 +0100 (MET)
Message-ID: <9302111614.AA15497@icarus.lis.pitt.edu>

Forgive the review of college math, but there are a few issues that, while obvious to many, might be worth reviewing here...
[1] The gradient of a well-behaved single-valued function of N variables (here the error as a function of the weights) is generally orthogonal to an N-1 dimensional manifold on which the function is constant (an iso-error surface) [2] The effect of infinitesimal motion in the space on the function can be computed as the inner (dot) product of the gradient vector with the movement vector; thus, as long as the dot product between the gradient and the delta-w vector is negative, the error will decrease. That is, the new iso-error surface will correspond to a lower error value. [3] This implies that the signs of the errors is adequate to reduce the error, assuming the learning rate is sufficiently small, since any two vectors with all components the same sign must have a positive inner product! [They lie in the same orthant of the space] Having said all this, I must point out that the argument pertains only to single patterns. That is, eliminating the derivative term, is guaranteed to reduce the error for the pattern that is presented. Its effect on the error summed over the training set is not guaranteed, even for batch learning... One more caveat: Of course, if the nonlinear part of the units' transfer function is non-monotonic (i.e., the sign of the derivative varies), be sure to throw the derivative back in! - Paul Munro  From dhw at t13.Lanl.GOV Thu Feb 11 17:19:13 1993 From: dhw at t13.Lanl.GOV (David Wolpert) Date: Thu, 11 Feb 93 15:19:13 MST Subject: new paper Message-ID: <9302112219.AA23017@t13.lanl.gov> *************************************************************** DO NOT FORWARD TO OTHER BOARDS OR LISTS *************************************************************** The following paper has been placed in neuroprose, under the name wolpert.nips92.ps.Z. It is a major revision of an earlier preprint on the same topic. An abbreviated version (2 fewer pages) will appear in the proceedings of NIPS 92. 0N THE USE OF EVIDENCE IN NEURAL NETWORKS. David H. Wolpert, Santa Fe Institute Abstract: The Bayesian evidence approximation, which is closely related to generalized maximum likelihood, has recently been employed to determine the noise and weight-penalty terms for training neural nets. This paper shows that it is far simpler to perform the exact calculation than it is to set up the evidence approximation. Moreover, unlike that approximation, the exact result does not have to be re-calculated for every new data set. Nor does it require the running of complex numerical computer code (the exact result is closed form). In addition, it turns out that for neural nets, the evidence procedure's MAP estimate is *in toto* approximation error. Another advantage of the exact analysis is that it does not lead to incorrect intuition, like the claim that one can "evaluate different priors in light of the data". This paper ends by discussing sufficiency conditions for the evidence approximation to hold, along with the implications of those conditions. Although couched in terms of neural nets, the analysis of this paper holds for any Bayesian interpolation problem. Recover the file in the usual way: unix> ftp cheops.cis.ohio-state.edu Connected to cheops.cis.ohio-state.edu. 220 cheops.cis.ohio-state.edu FTP server ready. Name: anonymous 331 Guest login ok, send ident as password. Password: {your address} 230 Guest login ok, access restrictions apply. ftp> binary 200 Type set to I. ftp> cd pub/neuroprose 250 CWD command successful. ftp> get wolpert.nips92.ps.Z 200 PORT command successful. 
150 Opening BINARY mode data connection for rosenblatt.reborn.ps.Z 226 Transfer complete. 100000 bytes sent in 3.14159 seconds ftp> quit 221 Goodbye unix> uncompress wolpert.nips92.ps.Z unix> lpr wolpert.nips92.ps (or however you print postscript  From mozer at dendrite.cs.colorado.edu Fri Feb 12 00:10:05 1993 From: mozer at dendrite.cs.colorado.edu (Michael C. Mozer) Date: Thu, 11 Feb 1993 22:10:05 -0700 Subject: connectionist models summer school -- final call for applications Message-ID: <199302120510.AA06977@neuron.cs.colorado.edu> FINAL CALL FOR APPLICATIONS CONNECTIONIST MODELS SUMMER SCHOOL The University of Colorado will host the 1993 Connectionist Models Summer School from June 21 to July 3, 1993. The purpose of the summer school is to provide training to promising young researchers in connectionism (neural networks) by leaders of the field and to foster interdisciplinary collaboration. This will be the fourth such program in a series that was held at Carnegie-Mellon in 1986 and 1988 and at UC San Diego in 1990. Previous summer schools have been extremely successful and we look forward to the 1993 session with anticipation of another exciting event. The summer school will offer courses in many areas of connectionist modeling, with emphasis on artificial intelligence, cognitive neuroscience, cognitive science, computational methods, and theoretical foundations. Visiting faculty (see list of invited faculty below) will present daily lectures and tutorials, coordinate informal workshops, and lead small discussion groups. The summer school schedule is designed to allow for significant interaction among students and faculty. As in previous years, a proceedings of the summer school will be published. Applications will be considered only from graduate students currently enrolled in Ph.D. programs. About 50 students will be accepted. Admission is on a competitive basis. Tuition will be covered for all students, and we expect to have scholarships available to subsidize housing and meal costs, but students are responsible for their own travel arrangements. Applications should include the following materials: * a vita, including mailing address, phone number, electronic mail address, academic history, list of publications (if any), and relevant courses taken with instructors' names and grades received; * a one-page statement of purpose, explaining major areas of interest and prior background in connectionist modeling and neural networks; * two letters of recommendation from individuals familiar with the applicants' work (either mailed separately or in sealed envelopes); and * a statement from the applicant describing potential sources of financial support available (department, advisor, etc.) for travel expenses. Applications should be sent to: Connectionist Models Summer School c/o Institute of Cognitive Science Campus Box 344 University of Colorado Boulder, CO 80309 All application materials must be received by March 1, 1993. Admission decisions will be announced around April 15. If you have specific questions, please write to the address above or send e-mail to "cmss at cs.colorado.edu". Application materials cannot be accepted via e-mail. 
Organizing Committee Jeff Elman (UC San Diego) Mike Mozer (University of Colorado) Paul Smolensky (University of Colorado) Dave Touretzky (Carnegie Mellon) Andreas Weigend (Xerox PARC and University of Colorado) Additional faculty will include: Yaser Abu-Mostafa (Cal Tech) Sue Becker (McMaster University) Andy Barto (University of Massachusetts, Amherst) Jack Cowan (University of Chicago) Peter Dayan (Salk Institute) Mary Hare (Birkbeck College) Cathy Harris (Boston University) David Haussler (UC Santa Cruz) Geoff Hinton (University of Toronto) Mike Jordan (MIT) John Kruschke (Indiana University) Jay McClelland (Carnegie Mellon) Ennio Mingolla (Boston University) Steve Nowlan (Salk Institute) Dave Plaut (Carnegie Mellon) Jordan Pollack (Ohio State) Dean Pomerleau (Carnegie Mellon) Dave Rumelhart (Stanford) Patrice Simard (ATT Bell Labs) Terry Sejnowski (UC San Diego and Salk Institute) Sara Solla (ATT Bell Labs) Janet Wiles (University of Queensland) The Summer School is sponsored by the American Association for Artificial Intelligence, the National Science Foundation, Siemens Research Center, and the University of Colorado Institute of Cognitive Science. Colorado has recently passed a law explicitly denying protection for lesbians, gays, and bisexuals. However, the Summer School does not discriminate in admissions on the basis of age, sex, race, national origin, religion, disability, veteran status, or sexual orientation.  From heiniw at sun1.eeb.ele.tue.nl Tue Feb 9 05:46:06 1993 From: heiniw at sun1.eeb.ele.tue.nl (Heini Withagen) Date: Tue, 9 Feb 1993 11:46:06 +0100 (MET) Subject: Summary of "Does backprop need the derivative ??" Message-ID: <9302091046.AA08161@sun1.eeb.ele.tue.nl> A non-text attachment was scrubbed... Name: not available Type: text Size: 191 bytes Desc: not available Url : https://mailman.srv.cs.cmu.edu/mailman/private/connectionists/attachments/00000000/7e2fa3eb/attachment.ksh From kolen-j at cis.ohio-state.edu Tue Feb 9 08:46:53 1993 From: kolen-j at cis.ohio-state.edu (john kolen) Date: Tue, 9 Feb 93 08:46:53 -0500 Subject: Does backprop need the derivative ?? In-Reply-To: Scott_Fahlman@sef-pmax.slisp.cs.cmu.edu's message of Sun, 07 Feb 93 12:56:03 EST <9302091257.AA06456@everest.eng.ohio-state.edu> Message-ID: <9302091346.AA28166@pons.cis.ohio-state.edu> From: Scott_Fahlman at sef-pmax.slisp.cs.cmu.edu Of course, a learning system implemented in analog hardware might have only a few bits of accuracy due to noise and nonlinearity in the circuits, but it wouldn't suffer from this quantization effect, since you get a sort of probabilistic dithering for free. This assumes, of course, that the mechanism is actually "computing" using the available bits. Bits are the result of binary measurements. An analog device does not normally convert voltages or currents into a binary representation and then operate on it. An analog mechanism sloppilly implementing backprop should be able to tweak the weights in the general direction, but not necessarily the same direction as theoretical backprop. John Kolen  From KRUSCHKE at ucs.indiana.edu Tue Feb 9 09:45:45 1993 From: KRUSCHKE at ucs.indiana.edu (John K. Kruschke) Date: Tue, 9 Feb 93 09:45:45 EST Subject: postdoctoral traineeships available Message-ID: POST-DOCTORAL FELLOWSHIPS AT INDIANA UNIVERSITY Postdoctoral Traineeships in MODELING OF COGNITIVE PROCESSES Please call this notice to the attention of all interested parties. 
The Psychology Department and Cognitive Science Programs at Indiana University are pleased to announce the availability of one or more Postdoctoral Traineeships in the area of Modeling of Cognitive Processes. The appointment will pay rates appropriate for a new PhD (about $18,800), and will be for one year, starting after July 1, 1993. The duration could be extended to two years if a training grant from NIH is funded as anticipated (we should receive final notification by May 1). Post-docs are offered to qualified individuals who wish to further their training in mathematical modeling or computer simulation modeling, in any substantive area of cognitive psychology or Cognitive Science. We are particularly interested in applicants with strong mathematical, scientific, and research credentials. Indiana University has superb computational and research facilities, and faculty with outstanding credentials in this area of research, including Richard Shiffrin and James Townsend, co-directors of the training program, and Robert Nosofsky, Donald Robinson, John Castellan, John Kruschke, Robert Goldstone, Geoffrey Bingham, and Robert Port. Trainees will be expected to carry out original theoretical and empirical research in association with one or more of these faculty and their laboratories, and to interact with other relevant faculty and the other pre- and postdoctoral trainees. Interested applicants should send an up to date vitae, personal letter describing their specific research interests, relevant background, goals, and career plans, and reference letters from two individuals. Relevant reprints and preprints should also be sent. Women, minority group members, and handicapped individuals are urged to apply. PLEASE NOTE: The conditions of our anticipated grant restrict awards to US citizens, or current green card holders. Awards will also have a 'payback' provision, generally requiring awardees to carry out research or teach for an equivalent period after termination of the traineeship. Send all materials to: Professors Richard Shiffrin and James Townsend, Program Directors Department of Psychology, Room 376B Indiana University Bloomington, IN 47405 We may be contacted at: 812-855-2722; Fax: 812-855-4691 email: shiffrin at ucs.indiana.edu Indiana University is an Affirmative Action Employer  From kenm at prodigal.psych.rochester.edu Tue Feb 9 10:50:49 1993 From: kenm at prodigal.psych.rochester.edu (Ken McRae) Date: Tue, 9 Feb 93 10:50:49 EST Subject: paper available Message-ID: <9302091550.AA20269@prodigal.psych.rochester.edu> The following paper is now available in pub/neuroprose. Catastrophic Interference is Eliminated in Pretrained Networks Ken McRae University of Rochester & Phil A. Hetherington McGill University When modeling strictly sequential experimental memory tasks, such as serial list learning, connectionist networks appear to experience excessive retroactive interference, known as catastrophic interference (McCloskey & Cohen,1989; Ratcliff, 1990). The main cause of this interference is overlap among representations at the hidden unit layer (French, 1991; Hetherington,1991; Murre, 1992). This can be alleviated by constraining the number of hidden units allocated to representing each item, thus reducing overlap and interference (French, 1991; Kruschke, 1992). When human subjects perform a laboratory memory experiment, they arrive with a wealth of prior knowledge that is relevant to performing the task. 
If a network is given the benefit of relevant prior knowledge, the representation of new items is constrained naturally, so that a sequential task involving novel items can be performed with little interference. Three laboratory memory experiments (ABA free recall, serial list, and ABA paired-associate learning) are used to show that little or no interference is found in networks that have been pretrained with a simple and relevant knowledge base. Thus, catastrophic interference is eliminated when critical aspects of simulations are made to be more analogous to the corresponding human situation. Thanks again to Jordan Pollack for maintaining this electronic library. An example of how to retrieve mcrae.pretrained.ps.Z: your machine> ftp archive.cis.ohio-state.edu Connected to archive.cis.ohio-state.edu. 220 archive FTP server (Version 6.15 Thu Apr 23 15:28:03 EDT 1992) ready. Name (archive.cis.ohio-state.edu:kenm): anonymous 331 Guest login ok, send e-mail address as password. Password: 230 Guest login ok, access restrictions apply. ftp> cd pub/neuroprose 250-Please read the file README 250- it was last modified on Mon Feb 17 15:51:43 1992 - 357 days ago 250-Please read the file README~ 250- it was last modified on Wed Feb 6 16:41:29 1991 - 733 days ago 250 CWD command successful. ftp> binary 200 Type set to I. ftp> get mcrae.pretrained.ps.Z 200 PORT command successful. 150 Opening BINARY mode data connection for mcrae.pretrained.ps.Z (129046 bytes). 226 Transfer complete. local: mcrae.pretrained.ps.Z remote: mcrae.pretrained.ps.Z 129046 bytes received in 30 seconds (4.2 Kbytes/s) ftp> quit 221 Goodbye. your machine> uncompress mcrae.pretrained.ps.Z your machine> then print the file  From kolen-j at cis.ohio-state.edu Tue Feb 9 13:31:43 1993 From: kolen-j at cis.ohio-state.edu (john kolen) Date: Tue, 9 Feb 93 13:31:43 -0500 Subject: Test & Derivatives in Backprop Message-ID: <9302091831.AA00142@pons.cis.ohio-state.edu> [I hope that this makes it to connectionists, the last couple of postings haven't made it back. So I have summarized these replies in one message for general consumption.] Regarding the latest talk about derivatives in backprop, I had looked into replacing the different mathematical operations with other, more implementation-amenable operations. This included replacing the derivative of the squashing function with d(x)=min(x,1-x). The results of these tests show that backprop is pretty stable as long as the qualitative shape of the operations are maintained. If you replace the derivative with a constant or linear (wrt activation) function it doesn't work at all for the learning tasks I considered. As long as the derivative replacement is minimal in the extreme activations and maximal at 0.5 (wrt the traditional sigmoid), the operation will not suffer dramatically. After reading Fahlman's observation about loosing bits to noise I had the following response. Bits come from binary decisions. Analog systems don't do that in normal processing, normally some continuous value affects another continuous value. No where do they perform A/D conversion and then operate on the bits. If there is no measurement device, then talking about bits doesn't make sense. John Kolen  From guy at cs.uq.oz.au Tue Feb 9 17:25:35 1993 From: guy at cs.uq.oz.au (guy@cs.uq.oz.au) Date: Wed, 10 Feb 93 08:25:35 +1000 Subject: Does backprop need the derivative ?? 
Message-ID: <9302092225.AA06661@client> The question has been asked whether the full derivative is needed for backprop to work, or whether the sign of the derivative is sufficient. As far as I am aware, the discussion has not defined at what point the derivative is truncated to +/-1. This might occur (1) for each input/output pair when the error is fed into the output layer, (2) in epoch based learning, the exact derivative of each weight over the training set might be computed, but the update to the weight truncated, or (3...) many intermediate cases. I believe one problem with limited precision weights is as follows. The magnitude of the update may be smaller than the limit of precision on the weight (which has much greater magnitude). If the machine arithmetic then rounds the updated weight to the nearest representable value, the updated weight will be rounded to its old value, and no learning will occur. I am co-author of a technical report which addressed this problem. In our algorithm, weights had very limited precision but their derivatives over the whole training set were computed exactly. The weight update step would shift the weight value to the next representable value with a probability proportional to the size of the derivative. In our inexhaustive testing, we found that very limited precision weights and activations could be used. The technical report is available in hardcopy (limited numbers) and postscript. My addresses are "guy at cs.uq.oz.au" and "Guy Smith, Department of Computer Science, The University of Queensland, St Lucia 4072, Australia". Guy Smith.  From meng at spring.kuee.kyoto-u.ac.jp Wed Feb 10 11:58:19 1993 From: meng at spring.kuee.kyoto-u.ac.jp (meng@spring.kuee.kyoto-u.ac.jp) Date: Wed, 10 Feb 93 11:58:19 JST Subject: Does backprop need the derivative ?? In-Reply-To: Scott_Fahlman@SEF-PMAX.SLISP.CS.CMU.EDU's message of Sun, 07 Feb 93 13:02:42 EST <9302091925.AA12414@ntt-sh.ntt.jp> 9 Feb 93 1:36:51 EST 9 Feb 93 1:35:08 EST 7 Feb 93 13:03:24 EST Message-ID: <9302100258.AA20634@spring.kuee.kyoto-u.ac.jp> Thinking about it, it seems that the derivative always can be replaced by a sufficiently small constant. I.e., for a certain training set and a certain requirement of precision on the ouput units, you can find a constant that is smaller than a certain constant that, with the same starting point, will find the same minimum for the same network as an algorithm that is using the derivative. The problem with this of course is that the constant may be so small that the training time may be prohibitive, while the motivation to such a constant is to speed up training. The reason that this works in a lot of instances is, I think, that the requirement of precision is wide enough to let the network jump into a region that is sufficiently close to a minimum. A situation where it wouldn't work, would be a situation where the network is moving in the right direction, but jumping too far, i.e. jumping from one side of a valley to the other alternately, never landing within a region that would give convergence within the requirements set. The use of the derivative solves this by getting smaller when approaching a minimum. Another possibility is that using a constant the network might settle in another minimum (or try to settle in another ("wider") minimum) by virtue of "seeing" the error surface as more coarse grained than the version using a derivative. In some cases, if you're lucky (i.e. 
has a good initial state in relation to a minimum and the constant you're using) you might hit bull's eye; with another initial state you might be oscillating around the solution (i.e. having the error go up and down without getting within the required limit). In such a case you could switch to using the derivative or simply decrease the constant (maybe how much could be computed on the basis of the increase in error? Just an idea). These are just some thoughts on the subject, no empirical study undertaken.

Tore

From "MVUB::BLACK%hermes.mod.uk" at relay.MOD.UK Fri Feb 12 09:50:00 1993
From: "MVUB::BLACK%hermes.mod.uk" at relay.MOD.UK (John V. Black @ DRA Malvern)
Date: Fri, 12 Feb 93 14:50 GMT
Subject: IEE Third International Conference on ANN's (Registration Announcement)
Message-ID:

CONFERENCE ANNOUNCEMENT
=======================
IEE Third International Conference on Artificial Neural Networks
Brighton, UK, 25-27 May 1993.
-----------------------------------------------------
This conference, organised by the Institute of Electrical Engineers, will cover up-to-date reports on the current state of research on Artificial Neural Networks, including theoretical understanding of fundamental structures, learning algorithms, implementation and applications. Over 70 papers will be presented in formal and poster sessions under the following headings: APPLICATIONS, ARCHITECTURES, VISION, CONTROL & ROBOTICS, MEDICAL SYSTEMS, NETWORK ANALYSIS.

In addition there will be a small exhibition and publishers' display, Civic Reception and Conference Dinner.

Registration fees are as follows:
Member (IEE/associated societies)  235 pounds sterling (inc 35 pounds VAT)
Non-member                         294 pounds sterling (inc 43.79 pounds VAT)
Research Student or Retired         83 pounds sterling (inc 12.36 pounds VAT)

Further information including full programme available from:
Sheila Griffiths
ANN93 Secretariat
Conference Services
Institute of Electrical Engineers
Savoy Place
London WC2R 0BL, UK
Tel: 071 344 5478/5477 Fax: 071 497 3633 Telex: 261776 IEE LDN G

John Black (jvb%hermes.mod.uk at relay.mod.uk) E-mailing for David Lowe

From kolen-j at cis.ohio-state.edu Fri Feb 12 08:11:58 1993
From: kolen-j at cis.ohio-state.edu (john kolen)
Date: Fri, 12 Feb 93 08:11:58 -0500
Subject: Does backprop need the derivative ??
In-Reply-To: Mark Evans's message of Thu, 11 Feb 93 10:26:03 GMT <3468.9302111026@it-research-institute.brighton.ac.uk>
Message-ID: <9302121311.AA20446@pons.cis.ohio-state.edu>

When I used the term stable in my previous posting, I did not mean the mathematical notion of stability as applied to a control system. What I meant was that the apparent behavior of the network, learning a set of associations of patterns, was unaffected by quantitative changes in these operations. An analogy I often use is the symbolic dynamics of unimodal iterated function systems. As long as a small number of qualitative conditions are true, then the system will exhibit the same symbol dynamics as other functions for which the conditions hold, regardless of the numerical differences between functions. Thus the bifurcation diagrams of rx(1-x) and a bump made up of sigmoids will exhibit the same type of period doubling cascades. Even if it wasn't mathematically stable, but was guaranteed to pass through a region of weight space with usable weights, most of the NN community would find it useful.

John

From shim at marlin.nosc.mil Fri Feb 12 13:00:08 1993
From: shim at marlin.nosc.mil (Randy L. Shimabukuro)
Date: Fri, 12 Feb 93 10:00:08 -0800
Subject: Summary of "Does backprop need the derivative ??"
Message-ID: <9302121800.AA01359@marlin.nosc.mil>

Congratulations on initiating a very lively discussion. From reading the responses though, it appears that people are interpreting your question differently. At the risk of adding to the confusion let me try to explain. It seems that some people are talking about the derivative of the transfer function (F') while others are talking about the gradient of the error function. We have looked at both cases:

We approximate F' in a manner similar to that suggested by George Bolt, letting F'(|x|) -> 1 for |x| < r and F'(|x|) -> a for |x| >= r, where a is a small positive constant and r is a point where F(r) is approximately 1. We have also, in a sense, approximated the gradient of the error function by quantizing the weight updates. This is similar to what Peterson and Hartman call "Manhattan updating". In this case it is important to preserve the sign of the derivative. We have found that the first type of approximation has very little effect on back propagation. Depending on the problem, the second type sometimes shortens the learning time and sometimes prevents the network from learning. In some cases it helps to decrease the size of the updates as learning progresses.

Randy Shimabukuro

From hartman%pav.mcc.com at mcc.com Sat Feb 13 17:36:04 1993
From: hartman%pav.mcc.com at mcc.com (E. Hartman)
Date: Sat, 13 Feb 93 16:36:04 CST
Subject: Re. does bp need the derivative?
Message-ID: <9302132236.AA01583@energy.pav.mcc.com>

Re. the question of the derivative in backprop, Javier Movellan and Randy Shimabukuro mentioned the "Manhattan updating" discussed in Peterson and Hartman ("Explorations of the Mean Field Theory Learning Algorithm", Neural Networks, Vol. 2, pp. 475-494, 1989). This technique computes the gradient exactly, but then keeps only the signs of the components and takes fixed-size weight steps (each weight is changed by a fixed amount, either up or down). We used this technique to advantage, both in backprop and mean field theory nets, on problems with inconsistent data -- data containing exemplars with identical inputs but differing outputs (one-to-many mapping). (The problem in the paper was a classification problem drawn from overlapping Gaussian distributions.)

The reason that this technique helped on this kind of problem is the following. Since the data was highly inconsistent, we found that before taking a step in weight space, it helped to average out the data inconsistencies by accumulating the gradient over a large number of patterns (large batch training). But, typically, it happens that some components of the gradient don't "average out" nicely and instead become very large. So the components of the gradient vary greatly in magnitude, which makes choosing a good learning rate difficult. "Manhattan updating" makes all the components equal in magnitude. We found it necessary to slowly reduce the step size as training proceeds.

Eric Hartman

From marwan at sedal.su.oz.au Sat Feb 13 03:03:55 1993
From: marwan at sedal.su.oz.au (Marwan Jabri)
Date: Sat, 13 Feb 1993 19:03:55 +1100
Subject: Test & Derivatives in Backprop
Message-ID: <9302130803.AA03429@sedal.sedal.su.OZ.AU>

> From: john kolen > > [I hope that this makes it to connectionists, the last couple of postings > haven't made it back. So I have summarized these replies in one message > for general consumption.] > > Regarding the latest talk about derivatives in backprop, I had looked into > replacing the different mathematical operations with other, more > implementation-amenable operations.
This included replacing the > derivative of the squashing function with d(x)=min(x,1-x). The results of > these tests show that backprop is pretty stable as long as the qualitative > shape of the operations is maintained. If you replace the derivative with > a constant or linear (wrt activation) function it doesn't work at all for > the learning tasks I considered. As long as the derivative replacement is > minimal in the extreme activations and maximal at 0.5 (wrt the traditional > sigmoid), the operation will not suffer dramatically. > > After reading Fahlman's observation about losing bits to noise I had the > following response. Bits come from binary decisions. Analog systems > don't do that in normal processing; normally some continuous value affects > another continuous value. Nowhere do they perform A/D conversion and then > operate on the bits. If there is no measurement device, then talking about > bits doesn't make sense. > > John Kolen > Are we talking about analog implementations? I hope so, because I am. If not, then forget this message. The derivative issue boils down to whether you can cheaply implement whatever approximation you use. The implication for training speed depends on how good your gradient approximations are. The bit-width issue boils down to how you will implement your storage (weights). Whether you use analog EEPROM, RAM converted with DACs or whatever, you have to deal with bit effects, unless you have a new analog high-precision storage device that can be implemented cheaply, in which case I will be eager to learn about it. If you have the analog dream device, then your next problem in analog implementation is the signal/noise ratio, unless your analog circuits are noiseless. Marwan ------------------------------------------------------------------- Marwan Jabri Email: marwan at sedal.su.oz.au Senior Lecturer Tel: (+61-2) 692-2240 SEDAL, Electrical Engineering, Fax: 660-1228 Sydney University, NSW 2006, Australia Mobile: (+61-18) 259-086  From miller at picard.ads.com Mon Feb 15 11:32:44 1993 From: miller at picard.ads.com (Kenyon Miller) Date: Mon, 15 Feb 93 11:32:44 EST Subject: Summary of "Does backprop need the derivative ??" Message-ID: <9302151632.AA03270@picard.ads.com> Paul Munro writes: > [3] This implies that the signs of the errors is adequate to reduce > the error, assuming the learning rate is sufficiently small, > since any two vectors with all components the same sign > must have a positive inner product! [They lie in the same > orthant of the space] I believe a critical point is being missed, that is, the derivative is being replaced by its sign at every stage in applying the chain rule, not just in the initial backpropagation of the error. Consider the following example:

     ----n2-----
    /           \
w--n1            n4
    \           /
     ----n3-----

In other words, there is an output neuron n4 which is connected to two neurons n2 and n3, each of which is connected to neuron n1, which has a weight w. Suppose the weight connecting n2 to n4 is negative and all other connections in the diagram are positive. Suppose further that n2 is saturated and none of the other neurons are saturated. Now, suppose that n4 must be decreased in order to reduce the error. Backpropagating along the n4-n2-n1 path, w receives an error term which would tend to increase n1, while backpropagating along the n4-n3-n1 path would result in a term which would tend to decrease n1.
If the true sigmoid derivative were used, the force to increase n1 would be dampened because n2 is saturated, and the net result would be to increase w and therefore increase n1 and n3 and decrease n4. However, replacing the sigmoid derivative with a constant could easily allow the n4-n2-n1 path to dominate, and the error at the output would increase. Thus, it is not a sound thing to do regardless of how many patterns are used for training. -Ken Miller.  From kanal at cs.UMD.EDU Mon Feb 15 12:35:27 1993 From: kanal at cs.UMD.EDU (Laveen N. Kanal) Date: Mon, 15 Feb 93 12:35:27 -0500 Subject: non-Turing machines? Message-ID: <9302151735.AA10355@mimsy.cs.UMD.EDU> I have only tuned into part of the quantum computers discussion and so I don't know if the following references have been mentioned in the discussion. Having speculated about natural perception not being modelable by Turing machines, I was not surprised to find similar speculation in the book Renewing Philosophy by Hilary Putnam (Harvard Univ. Press, 1992) which I picked up at the bookstore the other day. But Putnam does cite two specific references which may be of interest in this context. Marian Boykan Pour-El and Ian Richards, "The Wave Equation with Computable Initial Data Such That Its Unique Solution Is Not Computable," Advances in Mathematics, 39 (1981) p. 215-239. Georg Kreisel's review of the above paper in The Journal of Symbolic Logic, 47, No. 4, (1982) p. 900-902.  From ala at sans.kth.se Tue Feb 16 07:51:57 1993 From: ala at sans.kth.se (Anders Lansner) Date: Tue, 16 Feb 1993 13:51:57 +0100 Subject: MCPA'93 Call for Contributions Message-ID: <199302161251.AA02772@occipitalis.sans.kth.se> MCPA'93 Final Call **************************************************************************** * Invitation to * * International Workshop on Mechatronical Computer Systems * * for Perception and Action, June 1-3, 1993 * * Halmstad University, Sweden * * * * Final Call for Contributions * **************************************************************************** Mechatronical Computer Systems that Perceive and Act - A New Generation ======================================================================= Mechatronical computer systems, which we will see in advanced products and production equipment of tomorrow, are designed to do much more than calculate. The interaction with the environment and the integration of computational modules in every part of the equipment, engaging in every aspect of its functioning, put new, and conceptually different, demands on the computer system. A development towards a complete integration between the mechanical system, advanced sensors and actuators, and a multitude of processing modules can be foreseen. At the systems level, powerful algorithms for perceptual integration, goal-direction and action planning in real time will be critical components. The resulting action-oriented systems may interact with their environments by means of sophisticated sensors and actuators, often with a high degree of parallelism, and may be able to learn and adapt to different circumstances and environments. Perceiving the objects and events of the external world and acting upon the situation in accordance with an appropriate behaviour, whether programmed, trained, or learned, are key functions of these next-generation computer systems.
The aim of this first International Workshop on Mechatronical Computer Systems for Perception and Action is to gather researchers and industrial development engineers, who work with different aspects of this exciting new generation of computing systems and computer-based applications, to a fruitful exchange of ideas and results and, often interdisciplinary, discussions. Workshop Form ============= One of the days of the workshop will be devoted to true workshop activities. The objective is to identify and propose research directions and key problem areas in mechatronical computing systems for perception and action. In the morning session, invited speakers, as well as other workshop delegates, will give their perspectives on the theme of the workshop. The work will proceed in smaller working groups during the afternoon, after which the conclusions will be presented in a plenary session. The scientific programme will also include presentations of research results in oral or poster form, or as demonstrations. Subject Areas ============= Relevant subject areas are e.g.: Real-Time Systems Architecture and Real-Time Software. Sensor Systems and Sensory/Motor Coordination. Biologically Inspired Systems. Applications of Unsupervised and Reinforcement Learning. Real-Time Decision Making and Action Planning. Parallel Processor Architectures for Embedded Systems. Development Tools and Support Systems for Mechatronical Computer Systems and Applications. Dependable Computer Systems. Robotics and Machine Vision. Neural Networks in Real-Time Applications. Advanced Mechatronical Computing Demands in Industry. Contributions to the Workshop ============================= The programme committee welcomes all kinds of contributions - papers to be presented orally or as posters, demonstrations, etc. - in the areas listed above, as well as other areas of relevance to the theme of the workshop. From the workshop point of view, it is NOT essential that contributions contain only new, unpublished results. Rather, the new, interdisciplinary collection of delegates that can be expected at the workshop may motivate presentations of earlier published results. Specifically, we invite delegates to state their view of the workshop theme, including identification of key research issues and research directions. The planning of the workshop day will be based on these submitted statements, some of which will be presented in the plenary session, some of which in the smaller working groups. DEADLINES ========= Febr. 26, 1993: Submissions of extended abstracts or full papers. Submissions of statements regarding perspectives on the conference theme, that the delegate would like to present at the workshop (4 pages max). Submissions of descriptions of demonstrations, etc. March 19, 1993: Notification of acceptance. Preliminary final programme. May 1, 1993: Final papers and statements. All submissions shall be sent to the workshop secretariat, see address box. Please send two copies. Submissions must include name(s) and affiliation(s) of author(s) and full address, including phone and fax number and electronic mail address (if possible). The accepted papers and statements will be assembled into a Proceedings book given to the Workshop attendees. After the workshop a revised version of the proceedings, including results of the workshop discussions, will be published by an international publisher. Invited speakers ================ Prof. John A. Stankovic, University of Massachusetts, USA, and Scuola Superiore S.
Anna, Pisa, Italy: "Major Real-Time Challenges for Mechatronical Systems" Prof. Jan-Olof Eklundh, CVAP, Royal Institute of Technology, Stockholm, Sweden: "Computer Vision and Seeing Systems" Prof. Dave Cliff, School of Cognitive and Computing Sciences and Neuroscience IRC, University of Sussex, U.K. "Animate Vision in an Artificial Fly: A Study in Computational Neuroethology" & "Visual Sensory-Motor Networks Without Design: Evolving Visually Guided Robots" (More invited speakers to be confirmed.) ORGANISERS ========== The workshop is arranged by CCA, the Centre for Computer Architecture at Halmstad University, Sweden, in cooperation with the DAMEK Mechatronics Research Group and the SANS (Studies of Artificial Neural Systems) Research Group, both at the Royal Institute of Technology (KTH), Stockholm, Sweden, and the Department of Computer Engineering, Chalmers University of Technology, Gothenburg, Sweden. The Organising Committee includes: Lars Bengtsson, CCA, Organising Chair Anders Lansner, SANS Kenneth Nilsson, CCA Bertil Svensson, Chalmers University of Technology and CCA, Programme and Conference Chair Per-Arne Wiberg, CCA Jan Wikander, DAMEK The workshop is supported by SNNS, the Swedish Neural Network Society. It is financially supported by Halmstad University, the County Administration of Halland, Swedish industries and NUTEK (the Swedish National Board for Industrial and Technical Development). Programme Committee =================== Bertil Svensson, Sweden (chair) Paolo Ancilotti, Italy Lars Bengtsson, Sweden Giorgio Buttazzo, Italy Robert Forchheimer, Sweden Anders Lansner, Sweden Kenneth Nilsson, Sweden John Stankovic, Italy and USA Jan Torin, Sweden Hendrik van Brussel, Belgium Per-Arne Wiberg, Sweden Jan Wikander, Sweden Workshop Language: English Workshop fee: SEK 2 000, incl. proceedings, lunch- eons, reception and workshop dinner. Early registration (before April 20) SEK 1750. The number of attendees to the workshop is limited. Among those not submitting a contribution attendance will be given on a first-come, first-served basis. Social Activities ================= Reception, workshop dinner. Deep sea fishing tour or a visit at Varberg castle/fortress and museum. Bring your family, a programme for accompanying persons will be arranged. How to get there ================ Halmstad is situated on the west coast of Sweden between Copenhagen and Gothenburg (major international airports). With a distance of 150 kilometres to each of these cities it is easy and convenient to reach Halmstad by train, bus or car. Halmstad Airport is linked to Stockholm International Airport (Arlanda). Flight time Stockholm - Halmstad is 50 minutes. Accomodation ============ Arrangements will be made with local hotels, both downtown Halmstad and at the seaside. Different price categories will be available. Please let us know what price category and loca- tion you prefer and we help you with the booking. Payment is made directly to the hotel. Prices (breakfast included) in SEK: CATEGORY 1: SEK 750-850 single room, 750-950 double room Downtown Single room ( ) Double room ( ) Seaside Single room ( ) Double room ( ) CATEGORY 2: SEK 400 single room, 450 double room Near town Single room ( ) Double room ( ) Transportation between the hotels and the University will be arranged. ( ) I register already now. Send preliminary programme when available. ( ) I do not register yet but want the preliminary programme when available. Name ................................................... 
...................................................... Address................................................. ....................................................... ....................................................... Tel., Fax, e-mail .................................... ........................................................ ------------------------------------------------------------------------- MCPA Workshop Centre for Computer Architecture Halmstad University Box 823 S-30118 HALMSTAD Sweden Tel. +46 35 153134 (Lars Bengtsson) Fax. +46 35 157387 email: mcpa at cca.hh.se ------------------------------------------------------------------------ END OF MESSAGE  From harris at ai.mit.edu Tue Feb 16 18:50:28 1993 From: harris at ai.mit.edu (John G. Harris) Date: Tue, 16 Feb 93 18:50:28 EST Subject: Postdoc position in computational/biological vision (learning) Message-ID: <9302162350.AA05713@portofino> One (or possibly two) postdoctoral positions are available for one or two years in computational vision starting September 1993 (flexible). The postdoc will work in Lucia Vaina's laboratory at Boston University, College of Engineering, to conduct research in learning the direction in global motion. The researchers currently involved in this project are Lucia M. Vaina, John Harris, Charlie Chubb, Bob Sekuler, and Federico Girosi. Requirements are PhD in CS or related area with experience in visual modeling or psychophysics. Knowledge of biologically relevant neural models is desirable. Stipend ranges from $28,000 to $35,000 depending upon qualifications. Deadline for application is March 1, 1993. Two letter of recommendation, description of current research and an up to date CV are required. In the research we combine computational psychophysics, neural networks modeling and analog VLSI to study visual learning specifically applied to direction in global motion. The global motion problem requires estimation of the direction and magnitude of coherent motion in the presence of noise. We are proposing a set of psychophysical experiments in which the subject, or the network must integrate noisy, spatially local motion information from across the visual field in order to generate a response. We will study the classes of neural networks which best approximate the pattern of learning demonstrated in psychophysical tasks. We will explore Hebbian learning, multilayer perceptrons (e.g. backpropagation), cooperative networks, Radial Basis Function and Hyper-Basis Functions. The various strategies and their implementation will be evaluated on the basis of their performance and their biological plausibility. For more details, contact Prof. Lucia M. Vaina at vaina at buenga.bu.edu or lmv at ai.mit.edu.  From learn at galaxy.huji.ac.il Wed Feb 17 09:37:58 1993 From: learn at galaxy.huji.ac.il (learn conference) Date: Wed, 17 Feb 93 16:37:58 +0200 Subject: Learning Days in Jerusalem Message-ID: <9302171437.AA04425@galaxy.huji.ac.il> ========== DEADLINE FOR SUBMISSIONS: March 1, 1993 ========================== THE HEBREW UNIVERSITY OF JERUSALEM THE CENTER FOR NEURAL COMPUTATION LEARNING DAYS IN JERUSALEM Workshop on Fundamental Issues in Biological and Machine Learning May 30 - June 4, 1993 Hebrew University, Jerusalem, Israel The Center for Neural Computation at the Hebrew University is a new multi- disciplinary research center for collaborative investigation of the principles underlying computation and information processing in the brain and in neuron- like artificial computing systems. 
The Center's activities span theoretical studies of neural networks in physics, biology and computer science; experimental investigations in neurophysiology, psychophysics and cognitive psychology; and applied research on software and hardware implementations. The first international symposium sponsored by the Center will be held in the spring of 1993, at the Hebrew University of Jerusalem. It will focus on theoretical, experimental and practical aspects of learning in natural and artificial systems. Topics for the meeting include: Theoretical Issues in Supervised and Unsupervised Learning Neurophysiological Mechanisms Underlying Learning Cognitive Psychology and Learning Psychophysics Applications of Machine and Neural Network Learning Invited speakers include: Moshe Abeles (Hebrew U.) Yann LeCun (AT&T) Aharon Agranat (Hebrew U.) Joseph LeDoux (NYU) Ehud Ahissar (Weizmann Inst.) Christoph von der Malsburg (U. Bochum) Asher Cohen (Hebrew U.) Yishai Mansour (Tel Aviv U.) Yuval Davidor (Weizmann Inst.) Bruce McNaughton (U. of Arizona) Yadin Dudai (Weizmann Inst.) Helge Ritter (U. Bielefeld) Martha Farah (U. Penn) David Rumelhart (Stanford) David Haussler (UCSC) Dov Sagi (Weizmann Inst.) Nathan Intrator (Tel Aviv U.) Menachem Segal (Weizmann Inst.) Larry Jacoby (McMaster U.) Alex Waibel (CMU, U. Karlsruhe) Michael Jordan (MIT) Norman Weinberger (U.C. Irvine) Participation in the Workshop is limited to 100. A small number of contributed papers will be accepted. Interested researchers and students are asked to submit registration forms by **** March 1, 1993,***** to: Sari Steinberg Bchiri Tel: (972) 2 584563 Center for Neural Computation Fax: (972) 2 584437 c/o Racah Institute of Physics E-mail: learn at galaxy.huji.ac.il Hebrew University 91904 Jerusalem, Israel To ensure participation, please send a copy of the registration form by e-mail or fax as soon as possible. Organizing Committee: Shaul Hochstein, Haim Sompolinsky, Naftali Tishby. -------------------------------------------------------------------------------- REGISTRATION FORM Please complete the following form. To ensure participation, please send a copy of this form by e-mail or fax as soon as possible to: Sari Steinberg Bchiri E-MAIL: learn at galaxy.huji.ac.il Center for Neural Computation TEL: 972-2-584563 c/o Racah Institute of Physics FAX: 972-2-584437 Hebrew University 91904 Jerusalem, Israel Registration will be confirmed by e-mail. CONFERENCE REGISTRATION Name: _________________________________________________________________________ Affiliation: __________________________________________________________________ Address: ______________________________________________________________________ City: __________________ State: ______________ Zip: _________ Country: ________ Telephone: (____)________________ E-mail address: ____________________________ REGISTRATION FEES ____ Regular registration (before March 1): $100 ____ Student registration (before March 1): $50 ____ Late registration (after March 1): $150 ____ Student late registration (after March 1): $75 Please send payment by check or international money order in US dollars made payable to: Learning Workshop with this form by March 1, 1993 to avoid late fee. ACCOMMODATIONS If you are interested in assistance in reserving hotel accommodation for the duration of the Workshop, please indicate your preferences below: I wish to reserve a single/double (circle one) room from __________ to __________, for a total of _______ nights. 
CONTRIBUTED PAPERS A very limited number of contributed papers will be accepted. Participants interested in submitting papers should complete the following and enclose a 250-word abstract. Poster/Talk (circle one) Title: __________________________________________________________________ __________________________________________________________________  From kak at max.ee.lsu.edu Wed Feb 17 13:26:28 1993 From: kak at max.ee.lsu.edu (Dr. S. Kak) Date: Wed, 17 Feb 93 12:26:28 CST Subject: Reprints Message-ID: <9302171826.AA05612@max.ee.lsu.edu> Reprints of the following article are now available: ---------------------------------------------------------------- Ciruits, Systems, & Signal Processing, vol. 12, 1993, pp. 263-278 ---------------------------------------------------------------- Feedback Neural Networks: New Characteristics and a Generalization Subhash C. Kak Department of Electrical and Computer Engineering Louisiana State University, Baton Rouge, LA 70803, USA ABSTRACT New characteristics of feedback neural networks are studied. We discuss in detail the question of updating of neurons given incomplete information about the state of the neural network. We show how the mechanism of self-indexing [Self-indexing of neural memories, Physics Letters A, Vol. 143, 293-296, 1990.] for such updating provides better results than assigning 'don't know' values to the missing parts of the state vector. Issues related to the choice of the neural model for a feedback network are also considered. Properties of a new complex valued neuron model that generalizes McCulloch-Pitts neurons are examined. ----- Note: This issue of the journal is devoted exclusively to articles on neural networks.  From radford at cs.toronto.edu Wed Feb 17 15:15:58 1993 From: radford at cs.toronto.edu (Radford Neal) Date: Wed, 17 Feb 1993 15:15:58 -0500 Subject: Paper on "A new view of the EM algorithm" Message-ID: <93Feb17.151609edt.555@neuron.ai.toronto.edu> The following paper has been placed in the neuroprose archive, as the file 'neal.em.ps.Z': A NEW VIEW OF THE EM ALGORITHM THAT JUSTIFIES INCREMENTAL AND OTHER VARIANTS Radford M. Neal and Geoffrey E. Hinton Department of Computer Science University of Toronto We present a new view of the EM algorithm for maximum likelihood estimation in situations with unobserved variables. In this view, both the E and the M steps of the algorithm are seen as maximizing a joint function of the model parameters and of the distribution over unobserved variables. From this perspective, it is easy to justify an incremental variant of the algorithm in which the distribution for only one of the unobserved variables is recalculated in each E step. This variant is shown empirically to give faster convergence in a mixture estimation problem. A wide range of other variant algorithms are also seen to be possible. The PostScript for this paper may be retrieved in the usual fashion: unix> ftp archive.cis.ohio-state.edu (log in as user 'anonymous', e-mail address as password) ftp> binary ftp> cd pub/neuroprose ftp> get neal.em.ps.Z ftp> quit unix> uncompress neal.em.ps.Z unix> lpr neal.em.ps (or however you print PostScript files) Many thanks to Jordan Pollack for providing this service! 
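To make the incremental variant concrete, here is a minimal sketch for a two-component, unit-variance Gaussian mixture with equal priors (illustrative Python/NumPy only; the names and setup are invented here and are not code from the paper):

import numpy as np

def incremental_em(x, n_iters=20, seed=0):
    # Illustrative sketch: incremental E steps with running sufficient statistics.
    rng = np.random.default_rng(seed)
    n = len(x)
    mu = np.array([x.min(), x.max()], dtype=float)  # two component means; unit variances, equal priors assumed
    resp = np.full((n, 2), 0.5)                     # responsibilities: distributions over the hidden labels
    s_n = resp.sum(axis=0)                          # sufficient statistics, kept up to date incrementally
    s_x = resp.T @ x
    for _ in range(n_iters):
        for i in rng.permutation(n):
            # E step for a single point: remove its old contribution, recompute its responsibility
            s_n -= resp[i]
            s_x -= resp[i] * x[i]
            lik = np.exp(-0.5 * (x[i] - mu) ** 2)
            resp[i] = lik / lik.sum()
            s_n += resp[i]
            s_x += resp[i] * x[i]
            # immediate partial M step from the updated sufficient statistics
            mu = s_x / s_n
    return mu

rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(3.0, 1.0, 200)])
print(incremental_em(data))  # means recovered near -2 and 3

The point of the variant is that each visit recomputes the distribution for only one unobserved variable and re-estimates the parameters immediately from running statistics, rather than sweeping the whole data set before every M step.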
Radford Neal  From gem at cogsci.indiana.edu Thu Feb 18 09:03:23 1993 From: gem at cogsci.indiana.edu (Gary McGraw) Date: Thu, 18 Feb 93 09:03:23 EST Subject: Letter Spirit technical report available Message-ID: The following technical report from the Center for Research on Concepts and Cognition is available by ftp (only). Although the project described in the paper is not connectionism per se, it shares many of the same philosophical convictions. ---------------------------------------------------------------------- Letter Spirit: An Emergent Model of the Perception and Creation of Alphabetic Style Douglas Hofstadter & Gary McGraw The Letter Spirit project explores the creative act of artistic letter-design. The aim is to model how the $26$ lowercase letters of the roman alphabet can be rendered in many different but internally coherent styles. Viewed from a distance, the behavior of the program can be seen to result from the interaction of four emergent agents working together to form a coherent style and to design a complete alphabet: the Imaginer (which plays with the concepts behind letterforms), the Drafter (which converts ideas for letterforms into graphical realizations), the Examiner (which combines bottom-up and top-down processing to perceive and categorize letterforms), and the Adjudicator (which perceives and dynamically builds a representation of the evolving style). Creating a gridfont is an iterative process of guesswork and evaluation carried out by the four agents. This process is the ``central feedback loop of creativity''. Implementation of Letter Spirit is just beginning. This paper outlines our goals and plans for the project. --------------------------------------------------------------------------- The paper is available by anonymous ftp from: cogsci.indiana.edu (129.79.238.12) as pub/hofstadter+mcgraw.letter-spirit.ps.Z and in neuroprose: archive.cis.ohio-state.edu (128.146.8.52) as pub/neuroprose/hofstadter.letter-spirit.ps.Z Unfortunately, we are not able to distribute hardcopy at this time. *---------------------------------------------------------------------------* | Gary McGraw gem at cogsci.indiana.edu | (__) | |--------------------------------------------------| (oo) | | Center for Research on Concepts and Cognition | /-------\/ | | Department of Computer Science | / | || | | Indiana University | * ||----|| | | mcgrawg at moose.indiana.edu | ^^ ^^ | *---------------------------------------------------------------------------*  From mwitten at hermes.chpc.utexas.edu Thu Feb 18 10:00:41 1993 From: mwitten at hermes.chpc.utexas.edu (mwitten@hermes.chpc.utexas.edu) Date: Thu, 18 Feb 93 9:00:41 CST Subject: Computational Neurosciences Workshop Message-ID: <9302181500.AA03619@morpheus.chpc.utexas.edu> *********************************************************************** ** ** ** UNIVERSITY OF TEXAS SYSTEM CENTER FOR HIGH PERFORMANCE COMPUTING ** ** ** ** Workshop Series In Computational Medicine And Public Health** ** ** ** Announces ** ** ** ** A Workshop On Computational Neurosciences ** ** ** ** 14-15 May 1993 ** ** ** ** Austin, Texas ** ** ** *********************************************************************** Workshop Director: ----------------- Dr. 
Matthew Witten Associate Director, University of Texas System - CHPC Balcones Research Center 10100 Burnet Road, CMS 1.154 Austin, TX 78758-4497 USA Phone: (512) 471-2472 or (800) 262-2472 Fax : (512) 471-2445 email: m.witten at chpc.utexas.edu m.witten at uthermes.bitnet ***** Peliminary Program ***** List Of Current Speakers: ------------------------- Dr. Peter Fox, Director Research Imaging Center, UT HSC San Antonio Dr. Terry Mikiten, Associate Dean, Grad School of Biomedical Sciences, UT HSC San Antonio Dr. Robert Wyatt, Director, Institute For Theoretical Chemistry, UT Austin Dr. Elizabeth Thomas, Department of Chemistry, UT Austin Dr. George Adomian, Director, General Analytics Corporation, Athens, Georgia Dr. George Moore, Department of Biomedical Engineering, University of Southern California, Los Angeles, CA Dr. William Softky, California Institute of Technology, Pasadena, CA Dr. Cathy Wu, Department of Biomathematics and Computer Science, UT Health Center, Tyler, TX Dr. Dan Levine, Department of Mathematics, University of Texas at Arlington, Arlington, TX Dr. Michael Liebman, Senior Scientist, Amoco Technology Company, Naperville, Illinois Dr. George Stanford, Learning Abilities Center, UT Austin Dr. Tom Oakland, School of Education, UT Austin Dr. Matthew Witten, Associate Director, UT System - CHPC Objective, Agenda and Participants: ---------------------------------- The 1990's have been declared the Decade of the Mind. Understanding the mind requires the understanding of a wide variety of topics in the neurosciences. This Workshop is part of an ongoing series of workshops being held at the UT System Center For High Performance Computing; addressing issues of high performance computing and its role in medicine, dentistry, allied health disciplines, and public health. Prior workshops have covered Computational Chemistry and Molecular Design, and Computational Issues in the Life Sciences and Medicine. Upcoming workshops will focus on the subject areas of Computational Molecular Biology and Genetics, Biomechanics, and Physiological Modeling and Simulation. The purpose of this Workshop On Computational Neurosciences is to bring together interested scientists for the purposes of introducing them to state-of-the-art thinking and applications in the domain of neuroscience. Topics to be discussed range across the disciplines of neurosimulation, cognitive neuroscience, neural nets and their theory/application to a variety of problems, methods for solving numerical problems arising in neurology, learning abilities and disabilities, and neurological imaging. Lectures will be presented in a tutorial fashion, and time for questions and answers will be allowed. Attendence is open to anyone. A background in the neurosciences is not required. The size of the workshop is limited due to seating constraints. It is best to register as soon as possible. Schedule: -------- 14 May 1993 - Friday 8:00am - 9:00am Registration and Refreshments 9:00am - 9:15am Opening Remarks - Dr. James C. Almond, Director, UT System CHPC 9:15am - 10:00am Conference Overview - Dr. Matthew Witten 10:00am - 11:00am Dr. Peter Fox 11:00am - 11:30am Coffee Break 11:30am - 12:30pm Dr. Dan Levine 12:30pm - 1:30pm Lunch Break 1:30pm - 2:30pm Dr. Michael Liebman 2:30pm - 3:30pm Dr. Cathy Wu 3:30pm - 4:00pm Coffee Break 4:00pm - 5:00pm Dr. Terry Mikiten 15 May 1993 - Saturday 8:00am - 9:00am Registration and Refreshments 9:00am - 10:00am Dr. George Moore 10:00am - 11:00am Dr. Robert Wyatt and Dr. 
Elizabeth Thomas 11:00am - 11:30am Coffee Break 11:30am - 12:30pm Dr. George Adomian 12:30am - 1:30pm Lunch Break 1:30am - 2:30pm Dr. George Stanford and Dr. Tom Oakland 2:30am - 3:30pm Dr. William Softky 3:30pm - 4:00pm Coffee Break 4:00pm - 5:00pm Closing Discussion and Remarks Poster Sessions: ---------------- While no poster sessions are planned, if enough conference participants indicate a desire to present a poster, we will make every attempt to accommodate the requests. If you are interested in presenting a poster presentation at this meeting, please contact the workshop director. Conference Proceedings: ---------------------- We will make every attempt to have a publication quality conference proceedings. All of the speakers have been asked to submit a paper covering the talk material. The proceedings will appear as a special issue of the series Advances In Mathematics And Computers In Medicine, which is part of the International Journal of Computers and Mathematics With Applications (Pergamon Press). Individuals wishing to have an appropriate paper included in this proceedings should contact the workshop director for manuscript details and deadlines. Conference Costs and Funding: ----------------------------- A nominal registration fee of US $50.00 will be charged by 1 April 93, and US $60.00 after that date. The conference proceedings will be an additional US $10.00 . The conference registration fee includes luncheon and refreshments for both days of the workshop. Accomodations: ------------- There are a number of very reasonable hotels near the UT System CHPC. Additional information may be obtained by contacting the workshop coordinator at the address below. Registration and Information: ---------------------------- Registration requests and further questions should be directed to: Ms. Leslie Bockoven Administrative Associate Workshop On Computational NeuroSciences UT System - CHPC Balcones Research Center 10100 Burnet Road, CMS 1.154 Austin, TX 78758-4497 Phone: (512) 471-2472 or (800) 262-2472 Fax : (512) 471-2445 Email: neuro93 at chpc.utexas.edu neuro93 at uthermes.bitnet ============ REGISTRATION FORM FOLLOWS - CUT HERE ========== NAME (As will appear on badge): AFFILIATION (As will appear on badge): ADDRESS: PHONE: FAX : EMAIL: Please answer the following questions as appropriate: Do you wish to purchase a copy of the conference proceedings? If yes, make sure to include the proceedings purchase fee. Do you have any special dietary requirements? If yes, what are they? Do you wish to present a poster? If yes, what will the proposed title be? Do you wish to include a manuscript in the conference proceedings? If yes, what will the proposed topic be? Do you wish to be on our Workshop Series mailing list? If yes, please give the address for announcements (email is okay) Do you need a hotel reservation? Do you anticipate needing local transportation? ==================== END OF REGISTRATION FORM ============================  From gary at psyche.mit.edu Wed Feb 17 18:42:21 1993 From: gary at psyche.mit.edu (Gary Marcus) Date: Wed, 17 Feb 93 18:42:21 EST Subject: MIT Center for Cognitive Science Occasional Paper #47 Message-ID: <9302172342.AA04329@psyche.mit.edu> Would you please post the following announcement? Thank you very much. Sincerely, Gary Marcus ---- The following technical report is now available: MIT CENTER FOR COGNITIVE SCIENCE OCCASIONAL PAPER #47 German Inflection: The Exception that Proves the Rule Gary F. 
Marcus MIT Ursula Brinkmann Max-Planck-Institut fuer Psycholinguistik Harald Clahsen Richard Wiese Andreas Woest Universitaet Duesseldorf. Steven Pinker MIT ABSTRACT Language is often explained by generative rules and a memorized lexicon. For example, most English verbs take a regular past tense suffix (ask-asked), which is applied to new verbs (faxed, wugged), suggesting the mental rule "add -d to a Verb." Irregular verbs (break-broke, go-went) would be listed in memory. Connectionists argue instead that a pattern associator memory can store and generalize all past tense forms; irregular and regular patterns differ only because of their different numbers of verbs. We present evidence that mental rules are indispensable. A rule concatenates a suffix to a symbol for verbs, so it does not require access to memorized verbs or their sounds, but applies as the "default," whenever memory access fails. We find 20 such circumstances, including novel, unusual-sounding, and derived words; in every case, people inflect them regularly (explaining quirks like flied out, sabre-tooths, walkmans). Contrary to connectionist accounts, these effects are not due to regular words being in the majority. The German participle -t and plural -s apply to minorities of words. Two experiments eliciting ratings of novel German words show that the affixes behave like their English counterparts, as defaults. Thus default suffixation is not due to numerous regular words reinforcing a pattern in associative memory, but to a memory-independent, symbol-concatenating mental operation. --------------------------------------------------------------------------- Copies of the postscript file german.ps.Z may be obtained electronically from psyche.mit.edu as follows: unix-1> ftp psyche.mit.edu (or ftp 18.88.0.85) Connected to psyche.mit.edu. Name (psyche:): anonymous 331 Guest login ok, sent ident as password. Password: yourname 230 Guest login ok, access restrictions apply. ftp> cd pub 250 CWD command successful. ftp> binary 200 Type set to I. ftp> get german.ps.Z 200 PORT command successful. 150 Opening data connection for german.ps.Z (18.88.0.154,1500) (253471 bytes). 226 Transfer complete. local: german.ps.Z remote: german.ps.Z 166433 bytes received in 4.2 seconds (39 Kbytes/s) ftp> quit unix-2> uncompress german.ps.Z unix-3> lpr -P(your_local_postscript_printer) german.ps Or, order a hardcopy by sending your physical mail address to Eleanor Bonsaint (bonsaint at psyche.mit.edu), asking for Occasional Paper #47. Please do this only if you cannot use the ftp method described above.  From josh at faline.bellcore.com Thu Feb 18 10:59:52 1993 From: josh at faline.bellcore.com (Joshua Alspector) Date: Thu, 18 Feb 93 10:59:52 EST Subject: Workshop on applications of neural networks to telecommunications Message-ID: <9302181559.AA02043@faline.bellcore.com> CALL FOR PAPERS International Workshop on Applications of Neural Networks to Telecommunications Princeton, NJ October 18-20, 1993 You are invited to submit a paper to an international workshop on applications of neural networks to problems in telecommunications. The workshop will be held in Princeton, New Jersey on October 18-20, 1993. This workshop will bring together active researchers in neural networks with potential users in the telecommunications industry in a forum for discussion of applications issues. Applications will be identified, experiences shared, and directions for future work explored.
Suggested Topics: Application of Neural Networks in: Network Management Congestion Control Adaptive Equalization Speech Recognition Security Verification Language ID/Translation Information Filtering Dynamic Routing Software Reliability Fraud Detection Financial and Market Prediction Adaptive User Interfaces Fault Identification and Prediction Character Recognition Adaptive Control Data Compression Please submit 6 copies of both a 50 word abstract and a 1000 word summary of your paper by May 14, 1993. Mail papers to the conference administrator: Betty Greer Bellcore, MRE 2P-295 445 South St. Morristown, NJ 07960 (201) 829-4993 (fax) 829-5888 bg1 at faline.bellcore.com Abstract and Summary Due: May 14 Author Notification of Acceptance: June 18 Camera-Ready Copy of Paper Due: August 13 Organizing Committee: General Chair Josh Alspector Bellcore, MRE 2P-396 445 South St. Morristown, NJ 07960-6438 (201) 829-4342 josh at bellcore.com Program Chair Rod Goodman Caltech 116-81 Pasadena, CA 91125 (818) 356-3677 rogo at micro.caltech.edu Publications Chair Timothy X Brown Bellcore, MRE 2E-378 445 South St. Morristown, NJ 07960-6438 (201) 829-4314 timxb at faline.bellcore.com Treasurer Anthony Jayakumar, Bellcore Events Coordinator Larry Jackel, AT&T Bell Laboratories University Liaison S Y Kung, Princeton INNS Liaison Bernie Widrow, Stanford University IEEE Liaison Steve Weinstein, Bellcore Industry Liaisons Miklos Boda, Ellemtel Atul Chhabra, NYNEX Michael Gell, British Telecom Lee Giles, NEC Thomas John, Southwest Bell Adam Kowalczyk, Telecom Australia Conference Administrator Betty Greer Bellcore, MRE 2P-295 445 South St. Morristown, NJ 07960 (201) 829-4993 (fax) 829-5888 bg1 at faline.bellcore.com ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- International Workshop on Applications of Neural Networks to Telecommunications Princeton, NJ October 18-20, 1993 Registration Form Name: _____________________________________________________________ Institution: __________________________________________________________ Mailing Address: ___________________________________________________________________ ___________________________________________________________________ ___________________________________________________________________ ___________________________________________________________________ Telephone: ______________________________ Fax: ____________________________________ E-mail: _____________________________________________________________ I will attend | | Send more information | | Paper enclosed | | Registration Fee Enclosed ($350) | | (please make sure your name is on the check) Registration includes Monday night reception, Tuesday night banquet, and proceedings available at the conference. Mail to: Betty Greer Bellcore, MRE 2P-295 445 South St. Morristown, NJ 07960 (201) 829-4993 (fax) 829-5888 bg1 at faline.bellcore.com Deadline for submissions: May 14, 1993 Author Notification of Acceptance: June 18, 1993 Camera-Ready Copy of Paper Due: August 13, 1993  From miller at picard.ads.com Thu Feb 18 11:51:18 1993 From: miller at picard.ads.com (Kenyon Miller) Date: Thu, 18 Feb 93 11:51:18 EST Subject: correction to backprop example Message-ID: <9302181651.AA02454@picard.ads.com> For those of you who have lost interest in the backprop debate about replacing the sigmoid derivative with a constant, please disregard this message. 
It was recently pointed out to me that my backprop example was incomplete (I don't know the name of the sender): > The error need not be increased although w increased because W1-3 decreased > and W3-4 decreased. With 2 decreases and 1 increase, one could still expect > the N4 to decrease and also the error. > Rgds, > TH My original example (with typographical corrections) was: Consider the following example:

     ----n2-----
    /           \
w--n1            n4
    \           /
     ----n3-----

In other words, there is an output neuron n4 which is connected to two neurons n2 and n3, each of which is connected to neuron n1, which has a weight w. Suppose the weight connecting n2 to n4 is negative and all other connections in the diagram are positive. Suppose further that n2 is saturated and none of the other neurons are saturated. Now, suppose that n4 must be decreased in order to reduce the error. Backpropagating along the n4-n2-n1 path, w receives an error term which would tend to increase n1, while backpropagating along the n4-n3-n1 path would result in a term which would tend to decrease n1. If the true sigmoid derivative were used, the force to increase n1 would be dampened because n2 is saturated, and the net result would be to decrease w and therefore decrease n1, n3, n4, and the error. However, replacing the sigmoid derivative with a constant could easily allow the n4-n2-n1 path to dominate, and the error at the output would increase. The conclusion was that replacing the sigmoid derivative with a constant can result in increasing the error, and is therefore undesirable. CORRECTION TO THE EXAMPLE: The original example did not take into account the perturbation on W1-3 and W3-4, but the argument still holds with the following modification. Whatever the perturbation on W1-3 and W3-4, there exists (or at least a situation can be constructed such that there exists) some positive perturbation on w which will counteract those perturbations and result in an increase in the output error. Now replicate the n1-n2-n4 path as necessary by adding an n1-n5-n4 path, an n1-n6-n4 path etc. Each new path results in incrementing w by some constant delta, so there must exist some number of paths which results in a sufficient increase in w to cause an increase in the output error of the network. Thus, an example can be constructed in which the error increases, so the method cannot be considered theoretically sound. However, you can get virtually all of the benefit without any of the theoretical problems by using the derivative of the piecewise-linear function

          -------------------
         /
        /
       /
---------

which involves using a constant or zero for the derivative, depending on a simple range test. -Ken Miller.  From georgiou at silicon.csci.csusb.edu Thu Feb 18 13:04:55 1993 From: georgiou at silicon.csci.csusb.edu (George M. Georgiou) Date: Thu, 18 Feb 1993 10:04:55 -0800 Subject: Multivalued and Continuous Perceptrons (Preprint) Message-ID: <9302181804.AA24680@silicon.csci.csusb.edu> Rosenblatt's Perceptron Theorem guarantees us that a linearly separable function (R^n --> {0,1}) can be learned in finite time. Question: Is it possible to guarantee learning of a continuous-valued function (R^n --> (0,1)) which can be represented on a perceptron in finite time? This paper answers this question (and other ones too) in the affirmative: The Multivalued and Continuous Perceptrons by George M. Georgiou Rosenblatt's perceptron is extended to (1) a multivalued perceptron and (2) a continuous-valued perceptron.
It shown that any function that can be represented by the multivalued perceptron can be learned in a finite number of steps, and any function that can be represented by the continuous perceptron can be learned with arbitrary accuracy in a finite number of steps. The whole apparatus is defined in the complex domain. With these perceptrons learnability is extended to more complicated functions than the usual linearly separable ones. The complex domain promises to be a fertile ground for neural networks research. The file in the neuroprose is georgiou.perceptrons.ps.Z . Comments and questions on the proofs are welcome. --------------------------------------------------------------------- Sample session to get the file: unix> ftp archive.cis.ohio-state.edu (log in as user 'anonymous', e-mail address as password) ftp> binary ftp> cd pub/neuroprose ftp> get georgiou.perceptrons.ps.Z ftp> quit unix> uncompress georgiou.perceptrons.ps.Z unix> lpr georgiou.perceptrons.ps (or however you print PostScript files) Thanks to Jordan Pollack for providing this service! --George ---------------------------------------------------- Dr. George M. Georgiou E-mail: georgiou at wiley.csusb.edu Computer Science Department TEL: (909) 880-5332 California State University FAX: (909) 880-7004 5500 University Pkwy San Bernardino, CA 92407, USA  From rangarajan-anand at CS.YALE.EDU Thu Feb 18 13:18:40 1993 From: rangarajan-anand at CS.YALE.EDU (Anand Rangarajan) Date: Thu, 18 Feb 1993 13:18:40 -0500 Subject: No subject Message-ID: <199302181818.AA24890@COMPOSITION.SYSTEMSZ.CS.YALE.EDU> Programmer/Analyst Position in Artificial Neural Networks The Yale Center for Theoretical and Applied Neuroscience (CTAN) and the Department of Computer Science Yale University, New Haven, CT We are offering a challenging position in software engineering in support of new techniques in image processing and computer vision using artificial neural networks (ANNs). 1. Basic Function: Designer and programmer for computer vision and neural network software at CTAN and the Computer Science department. 2. Major duties: (a) To implement computer vision algorithms using a Khoros (or similar) type of environment. (b) Use the aforementioned tools and environment to run and analyze computer experiments in specific image processing and vision application areas. (c) To facilitate the improvement of neural network algorithms and architectures for vision and image processing. 3. Position Specifications: (a) Education: BA, including linear algebra, differential equations, calculus. helpful: mathematical optimization. (b) Experience: programming experience in C++ (or C) under UNIX. some of the following: neural networks, vision or image processing applications, scientific computing, workstation graphics, image processing environments, parallel computing, computer algebra and object-oriented design. Preferred starting date: March 1, 1993. For information or to submit an application, please write: Eric Mjolsness Department of Computer Science Yale University P. O. Box 2158 Yale Station New Haven, CT 06520-2158 e-mail: mjolsness-eric at cs.yale.edu Any application must also be submitted to: Jeffrey Drexler Department of Human Resources Yale University 155 Whitney Ave. 
New Haven, CT 06520 -Eric Mjolsness and Anand Rangarajan (prospective supervisors)  From pjs at bvd.Jpl.Nasa.Gov Thu Feb 18 14:49:36 1993 From: pjs at bvd.Jpl.Nasa.Gov (Padhraic Smyth) Date: Thu, 18 Feb 93 11:49:36 PST Subject: Position Available at JPL Message-ID: <9302181949.AA26236@bvd.jpl.nasa.gov> We currently have an opening in our group for a new PhD graduate in the general area of signal processing and pattern recognition. While the job description does not mention neural computation per se, it may be of interest to some members of the connectionist mailing list. For details see below. Padhraic Smyth, JPL RESEARCH POSITION AVAILABLE AT THE JET PROPULSION LABORATORY, CALIFORNIA INSTITUTE OF TECHNOLOGY The Communications Systems Research Section at JPL has an immediate opening for a permanent member of technical staff in the area of adaptive signal processing and statistical pattern recognition. The position requires a PhD in Electrical Engineering or a closely related field and applicants should have a demonstrated ability to perform independent research. A background in statistical signal processing is highly desirable. Background in information theory, estimation and detection, advanced statistical methods, and pattern recognition would also be a plus. Current projects within the group include the use of hidden Markov models for change detection in time series, and statistical methods for geologic feature detection in remotely sensed image data. The successful applicant will be expected to perform both basic and applied research and to propose and initiate new research projects. Permanent residency or U.S. citizenship is not a strict requirement - however, candidates not in either of these categories should be aware that their applications will only be considered in exceptional cases. Interested applicants should send their resume (plus any supporting background material such as recent relevant papers) to: Dr. Stephen Townes JPL 238-420 4800 Oak Grove Drive Pasadena, CA 91109. (email: townes at bvd.jpl.nasa.gov)  From mpp at cns.brown.edu Thu Feb 18 15:42:34 1993 From: mpp at cns.brown.edu (Michael P. Perrone) Date: Thu, 18 Feb 93 15:42:34 EST Subject: A computationally efficient squashing function Message-ID: <9302182042.AA03424@cns.brown.edu> Recently on the comp.ai.neural-nets bboard, there has been a discussion of more computationally efficient squashing functions. Some colleagues of mine suggested that many members of the Connectionist mailing list may not have access to the comp.ai.neural-nets bboard; so I have included a summary below. Michael ------------------------------------------------------ David L. Elliott mentioned using the following neuron activation function: f(x) = x / (1 + |x|) He argues that this function has the same qualitative properties as the hyperbolic tangent function but is in practice faster to calculate. I have suggested a similar speed-up for radial basis function networks: f(x) = 1 / (1 + x^2) which avoids the transcendental calculation associated with gaussian RBF nets. I have run simulations using the above squashing function in various backprop networks. The performance is comparable (sometimes worse, sometimes better) to the usual training using hyperbolic tangents. I also found that the performance of networks varied very little when the activation functions were switched (i.e. two networks with identical weights but different activation functions will have comparable performance on the same data).
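For concreteness, here is a minimal sketch (Python/NumPy, illustrative only; this is not the simulator used for the results described here) of the two rational functions next to the transcendental ones they replace:

import numpy as np

def fast_squash(x):
    # x / (1 + |x|): bounded in (-1, 1), steepest at 0, no exponential needed
    return x / (1.0 + np.abs(x))

def fast_squash_deriv(x):
    # derivative 1 / (1 + |x|)^2, also free of transcendentals
    return 1.0 / (1.0 + np.abs(x)) ** 2

def fast_rbf(x):
    # 1 / (1 + x^2): a rational stand-in for the Gaussian exp(-x^2)
    return 1.0 / (1.0 + x * x)

x = np.linspace(-6.0, 6.0, 7)
print(np.round(fast_squash(x), 3))        # compare shapes, not values:
print(np.round(np.tanh(x), 3))            # both saturate toward +/-1
print(np.round(fast_squash_deriv(x), 3))  # peaks at 0, vanishes at the extremes
print(np.round(fast_rbf(x), 3))           # compare with the Gaussian bump
print(np.round(np.exp(-x * x), 3))

The agreement is qualitative rather than numerical: the rational forms saturate and decay polynomially rather than exponentially, which is exactly what makes them cheap to evaluate.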
I tested these results on two databases: the NIST OCR database (preprocessed by Nestor Inc.) and the Turk and Pentland human face database. -------------------------------------------------------------------------------- Michael P. Perrone Email: mpp at cns.brown.edu Institute for Brain and Neural Systems Tel: 401-863-3920 Brown University Fax: 401-863-3934 Providence, RI 02912  From henrik at robots.ox.ac.uk Fri Feb 19 11:47:16 1993 From: henrik at robots.ox.ac.uk (henrik@robots.ox.ac.uk) Date: Fri, 19 Feb 93 16:47:16 GMT Subject: Squashing functions Message-ID: <9302191647.AA05729@cato.robots.ox.ac.uk> Any interesting squashing function can be stored in a table of negligible size (eg 256) with very high accuracy if linear (or higher) interpolation is used. So, on a RISC workstation, there is no need for improvements. If you deal with analog VLSI, anything goes, though ... Cheers, henrik at robots.ox.ac.uk  From cateau at tkyux.phys.s.u-tokyo.ac.jp Sat Feb 20 01:11:11 1993 From: cateau at tkyux.phys.s.u-tokyo.ac.jp (Hideyuki Cateau) Date: Sat, 20 Feb 93 15:11:11 +0900 Subject: TR:Univeral Power law Message-ID: <9302200611.AA21000@tkyux.phys.s.u-tokyo.ac.jp> I and my collaborators previously reported that there is a beautiful power law in the pace of the memory of Back Prop. We found a reaction from one of networkers that the law was established only in the special model. This time we performed an extensive simulation to show the law is fairly universal in the technical report:cateau.univ.tar.Z, Universal Power law in feed forward networks H.Cateau Department of Physics University of Tokyo Abstract: The power law in the pace of the memory, which was previously reported for the encoder, is shown to hold universally for general feed forward networks. An extensive simulation on wide variety of feed forward networks shows this and reveals a lot of interesting new observations. The PostScript for this paper may be retrieved in the usual fashion: unix> ftp archive.cis.ohio-state.edu (log in as user 'anonymous', e-mail address as password) ftp> binary ftp> cd pub/neuroprose ftp> get cateau.univ.tar.Z ftp> quit unix> uncompress cateau.univ.tar.Z unix> tar xvfo cateau.univ.tar Then you get three PS files:short.ps fig1.ps fig2.ps unix> lpr short.ps unix> lpr fig1.ps unix> lpr fig2.ps Hideyuki Cateau Particle theory group, Department of Physics,University of Tokyo,7-3-1, Hongo,Bunkyoku,113 Japan e-mail:cateau at tkyux.phys.s.u-tokyo.ac.jp  From soller at asylum.cs.utah.edu Fri Feb 19 16:09:43 1993 From: soller at asylum.cs.utah.edu (Jerome Soller) Date: Fri, 19 Feb 93 14:09:43 -0700 Subject: Industrial Position in Artificial Intelligence and/or Neural Networks Message-ID: <9302192109.AA22408@asylum.cs.utah.edu> I have just been made aware of a job opening in artificial intelligence and/or neural networks in southeast Ogden, UT. This company maintains strong technical interaction with existing industrial, U.S. government laboratory, and university strengths in Utah. Ogden is a half hour to 45 minute drive from Salt Lake City, UT. For further information, contact Dale Sanders at 801-625-8343 or dsanders at bmd.trw.com . The full job description is listed below. Sincerely, Jerome Soller U. of Utah Department of Computer Science and VA Geriatric, Research, Education and Clinical Center Knowledge engineering and expert systems development. Requires five years formal software development experience, including two years expert systems development. 
Requires experience implementing at least one working expert system. Requires familiarity with expert systems development tools and DoD specification practices. Experience with neural nets or fuzzy logic systems may qualify as equivalent experience to expert systems development. Familiarity with Ada, C/C++, database design, and probabilistic risk assessment strongly desired. Requires strong communication and customer interface skills. Minimum degree: BS in computer science, engineering, math, or physical science. M.S. or Ph.D. preferred. U.S. Citizenship is required. Relocation funding is limited.  From delliott at eng.umd.edu Fri Feb 19 15:22:38 1993 From: delliott at eng.umd.edu (David L. Elliott) Date: Fri, 19 Feb 1993 15:22:38 -0500 Subject: Abstract Message-ID: <199302192022.AA03327@verdi.eng.umd.edu> ABSTRACT A BETTER ACTIVATION FUNCTION FOR ARTIFICIAL NEURAL NETWORKS TR 93-8, Institute for Systems Research, University of Maryland by David L. Elliott-- ISR, NeuroDyne, Inc., and Washington University January 29, 1993 The activation function s(x) = x/(1 + |x|) is proposed for use in digital simulation of neural networks, on the grounds that the computational operation count for this function is much smaller than for those using exponentials and that it satisfies the simple differential equation s' = (1 + |s|)^2, which generalizes the logistic equation. The full report, a work-in-progress, is available in LaTeX or PostScript form (two pages + titlepage) by request to delliott at src.umd.edu.  From tony at aivru.shef.ac.uk Fri Feb 19 05:59:46 1993 From: tony at aivru.shef.ac.uk (Tony_Prescott) Date: Fri, 19 Feb 93 10:59:46 GMT Subject: lectureship Message-ID: <9302191059.AA23937@aivru> LECTURESHIP IN COGNITIVE SCIENCE University of Sheffield, UK. Applications are invited for the above post tenable from 1st October 1993 for three years in the first instance but with expectation of renewal. Preference will be given to candidates with a PhD in Cognitive Science, Artificial Intelligence, Cognitive Psychology, Computer Science, Robotics, or related disciplines. The Cognitive Science degree is an integrated course taught by the departments of Psychology and Computer Science. Research in Cognitive Science was highly evaluated in the recent UFC research evaluation exercise, special areas of interest being vision, speech, language, neural networks, and learning. The successful candidate will be expected to undertake research vigorously. Supervision of programming projects will be required, hence considerable experience with Lisp, Prolog, and/or C is essential. It is expected that the appointment will be made on the Lecturer A scale (13,400-18,576 pounds(uk) p.a.) according to age and experience but enquiries from more experienced staff able to bring research resources are welcomed. Informal enquiries to Professor John P Frisby 044-(0)742-826538 or e-mail jpf at aivru.sheffield.ac.uk. Further particulars from the director of Personnel Services, The University, Sheffield S10 2TN, UK, to whom all applications including a cv and the names and addresses of three referees (6 copies of all documents) should be sent by 1 April 1993. Short-listed candidates will be invited to Sheffield for interview for which travel expenses (within the UK only) will be funded. 
Current permanent research staff in Cognitive Science at Sheffield include: Prof John Frisby (visual psychophysics), Prof John Mayhew (computer vision, robotics, neural networks) Prof Yorick Wilks (natural language understanding) Dr Phil Green (speech recognition) Dr John Porrill (computer vision) Dr Paul McKevitt (natural language understanding) Dr Peter Scott (computer assisted learning) Dr Rod Nicolson (human learning) Dr Paul Dean (neuroscience, neural networks) Mr Tony Prescott (neural networks, comparative cog sci)  From delliott at src.umd.edu Sat Feb 20 15:23:57 1993 From: delliott at src.umd.edu (David L. Elliott) Date: Sat, 20 Feb 1993 15:23:57 -0500 Subject: Corrected Abstract Message-ID: <199302202023.AA12407@newra.src.umd.edu> ABSTRACT [corrected] A BETTER ACTIVATION FUNCTION FOR ARTIFICIAL NEURAL NETWORKS TR 93-8, Institute for Systems Research, University of Maryland by David L. Elliott-- ISR, NeuroDyne, Inc., and Washington University January 29, 1993 The activation function s(x) = x/(1 + |x|) is proposed for use in digital simulation of neural networks, on the grounds that the computational operation count for this function is much smaller than for those using exponentials and that it satisfies the simple differential equation s' = (1 - |s|)^2, which generalizes the logistic equation. The full report, a work-in-progress, is available in LaTeX or PostScript form (two pages + titlepage) by request to delliott at src.umd.edu. Thanks to Michael Perrone for calling my attention to the typo in s'.  From raina at max.ee.lsu.edu Sat Feb 20 17:37:45 1993 From: raina at max.ee.lsu.edu (Praveen Raina) Date: Sat, 20 Feb 93 16:37:45 CST Subject: No subject Message-ID: <9302202237.AA13139@max.ee.lsu.edu> The following comparison between the backpropagation and Kak algorithms for training feedforward networks will be of interest to many. We took 52 training samples, each having 25 input neurons and 3 output neurons. The training data were the monthly price index of a commodity for 60 months. Monthly prices were normalised and quantized into a 3-bit binary sequence. Each training sample represented prices taken over a period of 8 months (8x3 = 24 input neurons + 1 neuron for bias). The size of the learning window was fixed as 1 month. Binary values were used as the input for both the BP and Kak algorithms. For BP, the learning rate was taken as 0.45 and the momentum as 0.55. Training was performed on an IBM RISC 6000 machine. The training time for backpropagation was 4 minutes 5 seconds and the total number of iterations was 6101. The training time for the Kak algorithm was 5 seconds and the total number of iterations was 875. Thus, for this example the learning advantage of the Kak algorithm is a factor of 49. For larger examples the advantage becomes even greater. - Praveen Raina.  From unni at neuro.cs.gmr.com Sat Feb 20 14:57:13 1993 From: unni at neuro.cs.gmr.com (K.P.Unnikrishnan) Date: Sat, 20 Feb 93 14:57:13 EST Subject: A NEURAL COMPUTATION course reading list Message-ID: <9302201957.AA22392@neuro.cs.gmr.com> Folks: Here is the reading list for a course I offered last semester at Univ. of Michigan. Unnikrishnan --------------------------------------------------------------- READING LIST FOR THE COURSE "NEURAL COMPUTATION" EECS-598-6 (FALL 1992), UNIVERSITY OF MICHIGAN INSTRUCTOR: K. P. UNNIKRISHNAN ----------------------------------------------- A. COMPUTATION AND CODING IN THE NERVOUS SYSTEM 1. Hodgkin, A.L., and Huxley, A.F.
A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol. 117, 500-544 (1952). 2a. Del Castillo, J., and Katz, B. Quantal components of the end-plate potential. J. Physiol. 124, 560-573 (1954). 2b. Del Castillo, J., and Katz, B. Statistical factors involved in neuromuscular facilitation and depression. J. Physiol. 124, 574-585 (1954). 3. Rall, W. Cable theory for dendritic neurons. In: Methods in neural modeling (Koch and Segev, eds.) pp. 9-62 (1989). 4. Koch, C., and Poggio, T. Biophysics of computation: neurons, synapses and membranes. In: Synaptic function (Edelman, Gall, and Cowan, eds.) pp. 637-698 (1987). B. SENSORY PROCESSING IN VISUAL AND AUDITORY SYSTEMS 1. Werblin, F.S., and Dowling, J.E. Organization of the retina of the mudpuppy, Necturus maculosus: II. Intracellular recording. J. Neurophysiol. 32, 339-355 (1969). 2a. Barlow, H.B., and Levick, W.R. The mechanism of directionally selective units in rabbit's retina. J. Physiol. 178, 477-504 (1965). 2b. Lettvin, J.Y., Maturana, H.R., McCulloch, W.S., and Pitts, W.H. What the frog's eye tells the frog's brain. Proc. IRE 47, 1940-1951 (1959). 3. Hubel, D.H., and Wiesel, T.N. Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. J. Physiol. 160, 106-154 (1962). 4a. Suga, N. Cortical computational maps for auditory imaging. Neural Networks, 3, 3-21 (1990). 4b. Simmons, J.A. A view of the world through the bat's ear: the formation of acoustic images in echolocation. Cognition, 33, 155-199 (1989). C. MODELS OF SENSORY SYSTEMS 1. Hecht, S., Shlaer, S., and Pirenne, M.H. Energy, quanta, and vision. J. Gen. Physiol. 25, 819-840 (1942). 2. Julesz, B., and Bergen, J.R. Textons, the fundamental elements in preattentive vision and perception of textures. Bell Sys. Tech. J. 62, 1619-1645 (1983). 3a. Harth, E., Unnikrishnan, K.P., and Pandya, A.S. The inversion of sensory processing by feedback pathways: a model of visual cognitive functions. Science 237, 184-187 (1987). 3b. Harth, E., Pandya, A.S., and Unnikrishnan, K.P. Optimization of cortical responses by feedback modification and synthesis of sensory afferents. A model of perception and REM sleep. Concepts Neurosci. 1, 53-68 (1990). 3c. Koch, C. The action of the corticofugal pathway on sensory thalamic nuclei: A hypothesis. Neurosci. 23, 399-406 (1987). 4a. Singer, W. et al., Formation of cortical cell assemblies. In: CSH Symposia on Quant. Biol. 55, pp. 939-952 (1990). 4b. Eckhorn, R., Reitboeck, H.J., Arndt, M., and Dicke, P. Feature linking via synchronization among distributed assemblies: Simulations of results from cat visual cortex. Neural Comp. 293-307 (1990). 5. Reichardt, W., and Poggio, T. Visual control of orientation behavior in the fly. Part I. A quantitative analysis. Q. Rev. Biophys. 9, 311-375 (1976). D. ARTIFICIAL NEURAL NETWORKS 1a. Block, H.D. The perceptron: a model for brain functioning. Rev. Mod. Phy. 34, 123-135 (1962). 1b. Minsky, M.L., and Papert, S.A. Perceptrons. pp. 62-68 (1988). 2a. Hornik, K., Stinchcombe, M., and White, H. Multilayer feedforward networks are universal approximators. Neural Networks 2, 359-366 (1989). 2b. Lapedes, A., and Farber, R. How neural nets work. In: Neural Info. Proc. Sys. (Anderson, ed.) pp. 442-456 (1987). 3a. Ackley, D.H., Hinton, G.E., and Sejnowski, T.J. A learning algorithm for Boltzmann machines. Cog. Sci. 9, 147-169 (1985). 3b. Hopfield, J.J.
Learning algorithms and probability distributions in feed-forward and feed-back networks. PNAS, USA. 84, 8429-8433 (1987). 4. Tank, D.W., and Hopfield, J.J. Simple neural optimization networks: An A/D converter, signal decision circuit, and linear programming circuit. IEEE Tr. Cir. Sys. 33, 533-541 (1986). E. NEURAL NETWORK APPLICATIONS 1. LeCun, Y., et al., Backpropagation applied to handwritten zip code recognition. Neural Comp. 1, 541-551 (1990). 2. Lapedes, A., and Farber, R. Nonlinear signal processing using neural networks. LA-UR-87-2662, Los Alamos Natl. Lab. (1987). 3. Unnikrishnan, K.P., Hopfield, J.J., and Tank, D.W. Connected-digit speaker-dependent speech recognition using a neural network with time-delayed connections. IEEE Tr. ASSP. 39, 698-713 (1991). 4a. De Vries, B., and Principe, J.C. The gamma model - a new neural model for temporal processing. Neural Networks 5, 565-576 (1992). 4b. Poddar, P., and Unnikrishnan, K.P. Memory neuron networks: a prolegomenon. GMR-7493, GM Res. Labs. (1991). 5. Narendra, K.S., and Parthasarathy, K. Gradient methods for the optimization of dynamical systems containing neural networks. IEEE Tr. NN 2, 252-262 (1991). F. HARDWARE IMPLEMENTATIONS 1a. Mahowald, M.A., and Mead, C. Silicon retina. In: Analog VLSI and neural systems (Mead). pp. 257-278 (1989). 1b. Mahowald, M.A., and Douglas, R. A silicon neuron. Nature 354, 515-518 (1991). 2. Mueller, P. et al. Design and fabrication of VLSI components for a general purpose analog computer. In: Proc. IEEE workshop VLSI neural sys. (Mead, ed.) pp. xx-xx (1989). 3. Graf, H.P., Jackel, L.D., and Hubbard, W.E. VLSI implementation of a neural network model. Computer 2, 41-49 (1988). G. ISSUES ON LEARNING 1. Geman, S., Bienenstock, E., and Doursat, R. Neural networks and the bias/variance dilemma. Neural Comp. 4, 1-58 (1992). 2. Brown, T.H., Kairiss, E.W., and Keenan, C.L. Hebbian synapses: Biophysical mechanisms and algorithms. Ann. Rev. Neurosci. 13, 475-511 (1990). 3. Haussler, D. Quantifying inductive bias: AI learning algorithms and Valiant's learning framework. AI 36, 177-221 (1988). 4. Reeke, G.N. Jr., and Edelman, G.M. Real brains and artificial intelligence. Daedalus 117, 143-173 (1988). 5. White, H. Learning in artificial neural networks: a statistical perspective. Neural Comp. 1, 425-464 (1989). ---------------------------------------------------------------------- SUPPLEMENTAL READING Neher, E., and Sakmann, B. Single channel currents recorded from membrane of denervated frog muscle fibers. Nature 260, 779-781 (1976). Rall, W. Core conductor theory and cable properties of neurons. In: Handbook Physiol. (Brookhart, Mountcastle, and Kandel eds.) pp. 39-97 (1977). Shepherd, G.M., and Koch, C. Introduction to synaptic circuits. In: The synaptic organization of the brain (Shepherd, ed.) pp. 3-31 (1990). Junge, D. Synaptic transmission. In: Nerve and muscle excitation (Junge) pp. 149-178 (1981). Scott, A.C. The electrophysics of a nerve fiber. Rev. Mod. Phy. 47, 487-533 (1975). Enroth-Cugell, C., and Robson, J.G. The contrast sensitivity of retinal ganglion cells of the cat. J. Physiol. 187, 517-552 (1966). Felleman, D.J., and Van Essen, D.C. Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex, 1, 1-47 (1991). Julesz, B. Early vision and focal attention. Rev. Mod. Phy. 63, 735-772 (1991). Sejnowski, T.J., Koch, C., and Churchland, P.S. Computational neuroscience. Science 241, 1299-1302 (1988). Churchland, P.S., and Sejnowski, T.J. Perspectives on Cognitive Neuroscience.
Science 242, 741-745 (1988). McCulloch, W.S., and Pitts, W. A logical calculus of ideas immanent in nervous activity. Bull. Math. Biophy. 5, 115-133 (1943). Hopfield, J.J. Neural networks and physical systems with emergent collective computational abilities. PNAS, USA. 79, 2554-2558 (1982). Hopfield, J.J. Neurons with graded responses have collective computational properties like those of two-state neurons. PNAS, USA. 81, 3088-3092 (1984). Hinton, G.E., and Sejnowski, T.J. Optimal perceptual inference. Proc. IEEE CVPR. 448-453 (1983). Rumelhart, D.E., Hinton, G.E., and Williams, R.J. Learning representations by back-propagating errors. Nature 323, 533-536 (1986). Unnikrishnan, K.P., and Venugopal, K.P. Learning in connectionist networks using the Alopex algorithm. Proc. IEEE IJCNN. I-926 - I-931 (1992). Cowan, J.D., and Sharp, D.H. Neural nets. Quart. Rev. Biophys. 21, 365-427 (1988). Lippmann, R.P. An introduction to computing with neural nets. IEEE ASSP Mag. 4, 4-22 (1987). Sompolinsky, H. Statistical mechanics of neural networks. Phy. Today 41, 70-80 (1988). Hinton, G.E. Connectionist learning procedures. Art. Intel. 40, 185-234 (1989).  From demers at cs.ucsd.edu Sun Feb 21 13:45:24 1993 From: demers at cs.ucsd.edu (David DeMers) Date: Sun, 21 Feb 93 10:45:24 -0800 Subject: NIPS-5 papers: Nonlinear dimensionality reduction / Inverse kinematics Message-ID: <9302211845.AA24988@beowulf> Non-Linear Dimensionality Reduction David DeMers & Garrison Cottrell ABSTRACT -------- A method for creating a non-linear encoder-decoder for multidimensional data with compact representations is presented. The commonly used technique of autoassociation is extended to allow non-linear representations, and an objective function which penalizes activations of individual hidden units is shown to result in minimum dimensional encodings with respect to allowable error in reconstruction. ============================================================ Global Regularization of Inverse Kinematics for Redundant Manipulators David DeMers & Kenneth Kreutz-Delgado ABSTRACT -------- The inverse kinematics problem for redundant manipulators is ill-posed and nonlinear. There are two fundamentally different issues which result in the need for some form of regularization: the existence of multiple solution branches (global ill-posedness) and the existence of excess degrees of freedom (local ill-posedness). For certain classes of manipulators, learning methods applied to input-output data generated from the forward function can be used to globally regularize the problem by partitioning the domain of the forward mapping into a finite set of regions over which the inverse problem is well-posed. Local regularization can be accomplished by an appropriate parameterization of the redundancy consistently over each region. As a result, the ill-posed problem can be transformed into a finite set of well-posed problems. Each can then be solved separately to construct approximate direct inverse functions. ============================================================= Preprints are available from the neuroprose archive Retrievable in the usual way: unix> ftp archive.cis.ohio-state.edu (128.146.8.52) login as "anonymous", password = ftp> cd pub/neuroprose ftp> binary ftp> get demers.nips92-nldr.ps.Z ftp> get demers.nips92-robot.ps.Z ftp> bye unix> uncompress demers.*.ps.Z unix> lpr -s demers.nips92-nldr.ps unix> lpr -s demers.nips92-robot.ps (or however you print *LARGE* PostScript files) These papers will appear in S.J.
Hanson, J.E. Moody & C.L. Giles, eds, Advances in Neural Information Processing Systems 5 (Morgan Kaufmann, 1993). Dave DeMers demers at cs.ucsd.edu Computer Science & Engineering 0114 demers%cs at ucsd.bitnet UC San Diego ...!ucsd!cs!demers La Jolla, CA 92093-0114 (619) 534-0688, or -8187, FAX: (619) 534-7029  From srikanth at rex.cs.tulane.edu Sun Feb 21 14:41:45 1993 From: srikanth at rex.cs.tulane.edu (R. Srikanth) Date: Sun, 21 Feb 93 13:41:45 CST Subject: Abstract, New Squashing function... In-Reply-To: <199302192022.AA03327@verdi.eng.umd.edu>; from "David L. Elliott" at Feb 19, 93 3:22 pm Message-ID: <9302211941.AA17332@hercules.cs.tulane.edu> > > ABSTRACT > > A BETTER ACTIVATION FUNCTION FOR ARTIFICIAL NEURAL NETWORKS > > TR 93-8, Institute for Systems Research, University of Maryland > > by David L. Elliott-- ISR, NeuroDyne, Inc., and Washington University > January 29, 1993 > The activation function s(x) = x/(1 + |x|) is proposed for use in > digital simulation of neural networks, on the grounds that the > computational operation count for this function is much smaller than > for those using exponentials and that it satisfies the simple differential > equation s' = (1 + |s|)^2, which generalizes the logistic equation. > The full report, a work-in-progress, is available in LaTeX or PostScript > form (two pages + titlepage) by request to delliott at src.umd.edu. > > This squashing function, while not widely used, has been used by a few others. George Georgiou uses it for a complex back-propagation network. Not only does this activation function enable him to build a complex-valued backpropagation network, but it also seems to lend itself to easier implementation. For more information on complex domain backprop, contact Dr. George Georgiou at georgiou at meridian.csci.csusb.edu -- srikanth at cs.tulane.edu Dept of Computer Science, Tulane University, New Orleans, La - 70118  From delliott at src.umd.edu Sun Feb 21 15:00:03 1993 From: delliott at src.umd.edu (David L. Elliott) Date: Sun, 21 Feb 1993 15:00:03 -0500 Subject: Response Message-ID: <199302212000.AA17583@newra.src.umd.edu> Henrik- Thanks for your comment; you wrote: "Any interesting squashing function can be stored in a table of negligible size (eg 256) with very high accuracy if linear (or higher) interpolation is used." I think you are right *if the domain of the map is compact* a priori. Otherwise the approximation must eventually become constant for large x, and this has bad consequences for backpropagation algorithms. For some other training methods, perhaps not. David  From gluck at pavlov.rutgers.edu Mon Feb 22 08:05:05 1993 From: gluck at pavlov.rutgers.edu (Mark Gluck) Date: Mon, 22 Feb 93 08:05:05 EST Subject: Neural Computation & Cognition: Opening for NN Programmer Message-ID: <9302221305.AA04474@james.rutgers.edu> POSITION AVAILABLE: NEURAL-NETWORK RESEARCH PROGRAMMER At the Center for Neuroscience at Rutgers-Newark, we have an opening for a full or part-time research programmer to assist in developing neural-network simulations. The research involves integrated experimental and theoretical analyses of the cognitive and neural bases of learning and memory. The focus of this research is on understanding the underlying neurobiological mechanisms for complex learning behaviors in both animals and humans. Substantial prior experience and understanding of neural-network theories and algorithms is required. Applicants should have a high level of programming experience (C or Pascal), and familiarity with Macintosh and/or UNIX.
Strong English-language communication and writing skills are essential. *** This position would be particularly appropriate for a graduating college senior who seeks "hands-on" research experience prior to graduate school in the cognitive, neural, or computational sciences *** Applications are being accepted now for an immediate start-date or for starting in June or September of this year. NOTE TO N. CALIF. APPLICANTS: Interviews for applicants from the San Francisco/Silicon Valley area will be conducted at Stanford in late March. The Neuroscience Center is located 20 minutes outside of New York City in northern New Jersey. For further information, please send an email or hard-copy letter describing your relevant background, experience, and career goals to: ______________________________________________________________________ Dr. Mark A. Gluck Center for Molecular & Behavioral Neuroscience Rutgers University 197 University Ave. Newark, New Jersey 07102 Phone: (201) 648-1080 (Ext. 3221) Fax: (201) 648-1272 Email: gluck at pavlov.rutgers.edu  From peleg at cs.huji.ac.il Tue Feb 23 15:38:02 1993 From: peleg at cs.huji.ac.il (Shmuel Peleg) Date: Tue, 23 Feb 93 22:38:02 +0200 Subject: CFP: 12-ICPR, Int Conf Pattern Recognition, Jerusalem, 1994 Message-ID: <9302232038.AA28915@humus.cs.huji.ac.il> =============================================================================== CALL FOR PAPERS - 12th ICPR International Conferences on Pattern Recognition Oct 9-13, 1994, Jerusalem, Israel The 12th ICPR of the International Association for Pattern Recognition will be organized as a set of four conferences, each dealing with a special topic. The program for each individual conference will be organized by its own Program Committee. Papers describing applications are encouraged, and will be reviewed by a special Applications Committee. An award will be given for the best industry-related paper presented at the conference. Considerations for this award will include innovative applications, robust performance, and contributions to industrial progress. An exhibition will also be held. The conference proceedings are published by the IEEE Computer Society Press. GENERAL CO-CHAIRS: S. Ullman - Weizmann Inst. (shimon at wisdom.weizmann.ac.il) S. Peleg - The Hebrew University (peleg at cs.huji.ac.il) LOCAL ARRANGEMENTS: Y. Yeshurun - Tel-Aviv University (hezy at math.tau.ac.il) INDUSTRIAL & APPLICATIONS LIAISON: M. Ejiri - Hitachi (ejiri at crl.hitachi.co.jp) CONFERENCE DESCRIPTIONS 1. COMPUTER VISION AND IMAGE PROCESSING, T. Huang - University of Illinois Early vision and segmentation; image representation; shape and texture analysis; motion and stereo; range imaging and remote sensing; color; 3D representation and recognition. 2. PATTERN RECOGNITION AND NEURAL NETWORKS, N. Tishby - The Hebrew University Statistical, syntactic, and hybrid pattern recognition techniques; neural networks for associative memory, classification, and temporal processing; biologically oriented neural network models; biomedical applications. 3. SIGNAL PROCESSING, D. Malah - Technion, Israel Institute of Technology Analysis, representation, coding, and recognition of signals; signal and image enhancement and restoration; scale-space and joint time-frequency analysis and representation; speech coding and recognition; image and video coding; auditory scene analysis. 4. PARALLEL COMPUTING, S.
Tanimoto - University of Washington Parallel architectures and algorithms for pattern recognition, vision, and signal processing; special languages, programming tools, and applications of multiprocessor and distributed methods; design of chips, real-time hardware, and neural networks; recognition using multiple sensory modalities. PAPER SUBMISSION DEADLINE: February 1, 1994. Notification of Acceptance: May 1994. Camera-Ready Copy: June 1994. Send four copies of paper to: 12th ICPR, c/o International, 10 Rothschild blvd, 65121 Tel Aviv, ISRAEL. Tel. +972(3)510-2538, Fax +972(3)660-604 Each manuscript should include the following: 1. A Summary Page addressing these topics: - To which of the four conferences is the paper submitted? - What is the paper about? - What is the original contribution of this work? - Does the paper mainly describe an application, and should be reviewed by the applications committee? 2. The paper, limited in length to 4000 words. This is the estimated length of the proceedings version. For further information contact the secretariat at the above address, or use E-mail: icpr at math.tau.ac.il . ===============================================================================  From prechelt at ira.uka.de Tue Feb 23 08:55:11 1993 From: prechelt at ira.uka.de (prechelt@ira.uka.de) Date: Tue, 23 Feb 93 14:55:11 +0100 Subject: Squashing functions In-Reply-To: Your message of Fri, 19 Feb 93 16:47:16 +0000. <9302191647.AA05729@cato.robots.ox.ac.uk> Message-ID: > Any interesting squashing function can be stored in a table of negligible size > (eg 256) with very high accuracy if linear (or higher) interpolation is used. 256 points are not always negligible: On a fine-grain massively parallel machine such as the MasPar MP-1, the 256*4 bytes needed to store it can consume a considerable amount of the available memory. Our MP-1216A has 16384 processors with only 16 kB memory each. Another point: On this machine, I am not sure whether interpolating from such a table would really be faster than, say, a third order Taylor approximation of the sigmoid. Lutz Lutz Prechelt (email: prechelt at ira.uka.de) | Whenever you Institut fuer Programmstrukturen und Datenorganisation | complicate things, Universitaet Karlsruhe; D-7500 Karlsruhe 1; Germany | they get (Voice: ++49/721/608-4068, FAX: ++49/721/694092) | less simple.  From henrik at robots.ox.ac.uk Tue Feb 23 13:56:11 1993 From: henrik at robots.ox.ac.uk (henrik@robots.ox.ac.uk) Date: Tue, 23 Feb 93 18:56:11 GMT Subject: Squashing functions (continued) Message-ID: <9302231856.AA22594@cato.robots.ox.ac.uk> The saturation problem ('the activation function gets constant for large |x|') can usually be solved by putting the derivative of the act. function into a table as well. You can then cheat a bit by not setting it to zero at large |x|. Concerning memory requirements (eg, MasPar MP1): I don't see why I need 4 bytes per table entry. According to the paper by Fahlman & Hoehfeld on limited precision, the quantization can be done with very few bits (less than 8 if tricks are used). With interpolation you can get a pretty decent 16-bit act. value out of an 8-bit wide table. Apart from that, it seems to be quite complicated to put a nn on 16K processors ... how do you do that ?
Cheers, henrik at robots.ox.ac.uk  From xueh at microsoft.com Wed Feb 24 01:19:47 1993 From: xueh at microsoft.com (Xuedong Huang) Date: Tue, 23 Feb 93 22:19:47 PST Subject: Microsoft Speech Research Message-ID: <9302240620.AA07680@netmail.microsoft.com> As you may know, I've started a new speech group here at Microsoft. For your information, I have enclosed the full advertisement we have been using to publicize the openings. If you are interested in joining MS, I strongly encourage you to apply and we will look forward to following up with you. ------------------------------------------------------------ THE FUTURE IS HERE. Speech Recognition. Intuitive Graphical Interfaces. Sophisticated User Agents. Advanced Operating Systems. Robust Environments. World Class Applications. Who's Pulling It All Together? Microsoft. We're setting the stage for the future of computing, building a world class research group and leveraging a solid foundation of object based technology and scalable operating systems. What's more, we're extending the recognition paradigm, employing advanced processor and RISC-based architecture, and harnessing distributed networks to connect users to worlds of information. We want to see more than just our own software running. We want to see a whole generation of users realize the future of computing. Realize your future with a position in our Speech Recognition group. Research Software Design Engineers, Speech Recognition. Primary responsibilities include designing and developing User Interface and systems level software for an advanced speech recognition system. A minimum of 3 years demonstrated microcomputer software design and development experience in C is required. Knowledge of Windows programming, speech recognition systems, hidden Markov model theory, statistics, DSP, or user interface development is preferred. A BA/BS in computer science or related discipline is required. An advanced degree (MS or Ph.D.) in a related discipline is preferred. Researchers, Speech Recognition. Primary responsibilities include research on stochastic modeling techniques to be applied to an advanced speech recognition system. A minimum of 4 years demonstrated research excellence in the area of speech recognition or spoken language understanding systems is required. Knowledge of Windows and real-time C programming for microcomputers, hidden Markov model theory, decoder systems design, DSP, and spoken language understanding is preferred. A MA/MS in CS or related discipline is required. A PhD degree in CS, EE, or related discipline is preferred. Make The Most of Your Future. At Microsoft, our technical leadership and strong Software Developers and Researchers stay ahead of the times, creating vision and turning it into reality. To apply, send your resume and cover letter, noting "ATTN: N5935-0223" to: Surface: Microsoft Recruiting ATTN: N5935-0223 One Microsoft Way Redmond, WA 98052-6399 Email: ASCII ONLY y-wait at microsoft.com.us Microsoft is an equal opportunity employer working to increase workforce diversity.  
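The squashing-function exchange above (a small interpolated lookup table versus Elliott's rational activation s(x) = x/(1 + |x|)) is easy to make concrete. The following C sketch is illustrative only: the table size (256), the tabulated range [-8, 8] and the choice of tanh as the tabulated function are assumptions made for the example, not values taken from any of the posts.

/*
 * Two cheap squashing functions from the thread above:
 *  - Elliott's rational activation s(x) = x / (1 + |x|), whose derivative
 *    can be written through the output as (1 - |s|)^2, so no exponential
 *    is needed;
 *  - a small tanh lookup table with linear interpolation, which is exact
 *    enough inside its range but saturates to a constant outside it.
 * Table size and range are illustrative assumptions.
 */
#include <stdio.h>
#include <math.h>

#define TABLE_SIZE 256
#define TABLE_MIN  (-8.0)
#define TABLE_MAX  ( 8.0)

static double tanh_table[TABLE_SIZE];

/* Elliott activation: output lies in (-1, 1), no exponential required. */
double elliott(double x)
{
    return x / (1.0 + fabs(x));
}

/* Its derivative, expressed through the output s as (1 - |s|)^2. */
double elliott_deriv(double s)
{
    double t = 1.0 - fabs(s);
    return t * t;
}

/* Fill the table with exact tanh values over [TABLE_MIN, TABLE_MAX]. */
void build_table(void)
{
    int i;
    for (i = 0; i < TABLE_SIZE; i++) {
        double x = TABLE_MIN + (TABLE_MAX - TABLE_MIN) * i / (TABLE_SIZE - 1);
        tanh_table[i] = tanh(x);
    }
}

/* Lookup with linear interpolation; clamps (saturates) outside the range. */
double tanh_lookup(double x)
{
    double pos, frac;
    int i;
    if (x <= TABLE_MIN) return tanh_table[0];
    if (x >= TABLE_MAX) return tanh_table[TABLE_SIZE - 1];
    pos  = (x - TABLE_MIN) / (TABLE_MAX - TABLE_MIN) * (TABLE_SIZE - 1);
    i    = (int)pos;
    frac = pos - i;
    return tanh_table[i] + frac * (tanh_table[i + 1] - tanh_table[i]);
}

int main(void)
{
    double x;
    build_table();
    for (x = -2.0; x <= 2.0; x += 1.0)
        printf("x=%5.2f  elliott=%8.5f  d/dx=%8.5f  tanh~=%8.5f  tanh=%8.5f\n",
               x, elliott(x), elliott_deriv(elliott(x)),
               tanh_lookup(x), tanh(x));
    return 0;
}

As Elliott's reply in the thread points out, the table version necessarily becomes constant outside the tabulated range, which is the behaviour that can hurt gradient-based training when the inputs are not known to be bounded; the rational activation avoids that at the cost of a division per unit.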
From john at cs.uow.edu.au Fri Feb 26 13:56:21 1993 From: john at cs.uow.edu.au (John Fulcher) Date: Fri, 26 Feb 93 13:56:21 EST Subject: submission Message-ID: <199302260256.AA25570@wraith.cs.uow.edu.au> COMPUTER STANDARDS & INTERFACES (North-Holland) Forthcoming Special Issue on ANN Standards ADDENDUM TO ORIGINAL POSTING Prompted by enquiries from several people regarding my original Call for Papers posting, I felt I should offer the following additional information (clarification). By ANN "Standards" we do not mean exclusively formal standards (in the ISO, IEEE, ANSI, CCITT etc. sense), although naturally enough we will be including papers on activities in these areas. "Standards" should be interpreted in its most general sense, namely as standard APPROACHES (e.g. the backpropagation algorithm & its many variants). Thus if you have a paper on some (any?) aspect of ANNs, provided it is prefaced by a summary of the standard approach(es) in that particular area, it could well be suitable for inclusion in this special issue of CS&I. If in doubt, post fax or email a copy by April 30th to: John Fulcher, Department of Computer Science, University of Wollongong, Northfields Avenue, Wollongong NSW 2522, Australia. fax: +61 42 213262 email: john at cs.uow.edu.au.oz  From terry at helmholtz.sdsc.edu Thu Feb 25 14:57:05 1993 From: terry at helmholtz.sdsc.edu (Terry Sejnowski) Date: Thu, 25 Feb 93 11:57:05 PST Subject: Neural Computation 5:2 Message-ID: <9302251957.AA14806@helmholtz.sdsc.edu> NEURAL COMPUTATION - Volume 5 - Issue 2 - March 1993 Review Neural Networks and Non-Linear Adaptive Filtering: Unifying Concepts and New Algorithms O. Nerrand, P. Roussel-Ragot, L. Personnaz, G. Dreyfus and S. Marcos Notes Fast Calculation of Synaptic Conductances Rajagopal Srinivasan and Hillel J. Chiel The Variance of Covariance Rules for Associative Matrix Memories and Reinforcement Learning Peter Dayan and Terrence J. Sejnowski Optimal Network Construction by Minimum Description Length Gary D. Kendall and Trevor J. Hall Letters A Neural Network Model of Inhibitory Information Processing in Aplysia Diana E.J. Blazis, Thomas M. Fischer and Thomas J. Carew Computational Diversity in a Formal Model of the Insect Olfactory Macroglomerulus C. Linster, C. Masson, M. Kerszberg, L. Personnaz and G. Dreyfus Learning Competition and Cooperation Sungzoon Cho and James A. Reggia Constraints on Synchronizing Oscillator Networks David E. Cairns, Roland J. Baddeley and Leslie S. Smith Learning Mixture Models of Spatial Coherence Suzanna Becker and Geoffrey E. Hinton Hints and the VC Dimension Yaser S. Abu-Mostafa Redundancy Reduction as a Strategy for Unsupervised Learning A. Norman Redlich Approximation and Radial-Basis-Function Networks Jooyoung Park and Irwin W. Sandberg A Polynomial Time Algorithm for Generating Neural Networks for Pattern Classification - its Stability Properties and Some Test Results Somnath Mukhopadhyay, Asim Roy, Lark Sang Kim and Sandeep Govil Neural Networks for Optimization Problems with Inequality Constraints - The Knapsack Problem Mattias Ohlsson, Carsten Peterson and Bo Soderberg ----- SUBSCRIPTIONS - VOLUME 5 - BIMONTHLY (6 issues) ______ $40 Student ______ $65 Individual ______ $156 Institution Add $22 for postage and handling outside USA (+7% GST for Canada). (Back issues from Volumes 1-4 are regularly available for $28 each to institutions and $14 each for individuals Add $5 for postage per issue outside USA (+7% GST for Canada) MIT Press Journals, 55 Hayward Street, Cambridge, MA 02142. 
Tel: (617) 253-2889 FAX: (617) 258-6779 e-mail: hiscox at mitvma.mit.edu -----  From mark at dcs.kcl.ac.uk Fri Feb 26 08:25:01 1993 From: mark at dcs.kcl.ac.uk (Mark Plumbley) Date: Fri, 26 Feb 93 13:25:01 GMT Subject: King's College London Neural Networks MSc and PhD courses Message-ID: <17179.9302261325@xenon.dcs.kcl.ac.uk> Fellow Neural Networkers, Please post or forward this announcement about our M.Sc. and Ph.D. courses in Neural Networks to anyone who might be interested. Thanks, Mark Plumbley ------------------------------------------------------------------------- Dr. Mark D. Plumbley Tel: +44 71 873 2241 Centre for Neural Networks Fax: +44 71 873 2017 Department of Mathematics/King's College London/Strand/London WC2R 2LS/UK ------------------------------------------------------------------------- CENTRE FOR NEURAL NETWORKS and DEPARTMENT OF MATHEMATICS King's College London Strand London WC2R 2LS, UK M.Sc. AND Ph.D. COURSES IN NEURAL NETWORKS --------------------------------------------------------------------- M.Sc. in INFORMATION PROCESSING and NEURAL NETWORKS --------------------------------------------------- A ONE YEAR COURSE CONTENTS Dynamical Systems Theory Fourier Analysis Biosystems Theory Advanced Neural Networks Control Theory Combinatorial Models of Computing Digital Learning Digital Signal Processing Theory of Information Processing Communications Neurobiology REQUIREMENTS First Degree in Physics, Mathematics, Computing or Engineering NOTE: For 1993/94 we have 3 SERC quota awards for this course. --------------------------------------------------------------------- Ph.D. in NEURAL COMPUTING ------------------------- A 3-year Ph.D. programme in NEURAL COMPUTING is offered to applicants with a First degree in Mathematics, Computing, Physics or Engineering (others will also be considered). The first year consists of courses given under the M.Sc. in Information Processing and Neural Networks (see attached notice). Second and third year research will be supervised in one of the various programmes in the development and application of temporal, non-linear and stochastic features of neurons in visual, auditory and speech processing. There is also work in higher level category and concept formation and episodic memory storage. Analysis and simulation are used, both on PC's SUNs and main frame machines, and there is a programme on the development and use of adaptive hardware chips in VLSI for pattern and speed processing. This work is part of the activities of the Centre for Neural Networks in the School of Physical Sciences and Engineering, which has over 40 researchers in Neural Networks. It is one of the main centres of the subject in the U.K. --------------------------------------------------------------------- For further information on either of these courses please contact: Postgraduate Secretary Department of Mathematics King's College London Strand London WC2R 2LS, UK MATHS at OAK.CC.KCL.AC.UK  From ro2m at crab.psy.cmu.edu Mon Feb 1 11:34:06 1993 From: ro2m at crab.psy.cmu.edu (Randall C. O'Reilly) Date: Mon, 1 Feb 93 11:34:06 EST Subject: 2 pdp.cns TR's available Message-ID: <9302011634.AA06379@crab.psy.cmu.edu.noname> The following two (related) TR's are now available for electronic ftp or by hardcopy. Instructions follow the abstracts. >>> NOTE THAT THE FTP SITE IS OUR OWN, NOT NEUROPROSE <<< Object Recognition and Sensitive Periods: A Computational Analysis of Visual Imprinting Randall C. O'Reilly Mark H. Johnson Technical Report PDP.CNS.93.1 (Submitted to Neural Computation) Abstract: Evidence from a variety of methods suggests that a localized portion of the domestic chick brain, the Intermediate and Medial Hyperstriatum Ventrale (IMHV), is critical for filial imprinting. Data further suggest that IMHV is performing the object recognition component of imprinting, as chicks with IMHV lesions are impaired on other tasks requiring object recognition. We present a neural network model of translation invariant object recognition developed from computational and neurobiological considerations that incorporates some features of the known local circuitry of IMHV. In particular, we propose that the recurrent excitatory and lateral inhibitory circuitry in the model, and observed in IMHV, produces hysteresis on the activation state of the units in the model and the principal excitatory neurons in IMHV. Hysteresis, when combined with a simple Hebbian covariance learning mechanism, has been shown in earlier work to produce translation invariant visual representations. To test the idea that IMHV might be implementing this type of object recognition algorithm, we have used a simple neural network model to simulate a variety of different empirical phenomena associated with the imprinting process. These phenomena include reversibility, sensitive periods, generalization, and temporal contiguity effects observed in behavioral studies of chicks. In addition to supporting the notion that these phenomena, and imprinting itself, result from the IMHV properties captured in the simplified model, the simulations also generate several predictions and clarify apparent contradictions in the behavioral data. ----------------------------------------------------------------------- The Self-Organization of Spatially Invariant Representations Randall C. O'Reilly James L. McClelland Technical Report PDP.CNS.92.5 Abstract: The problem of computing object-based visual representations can be construed as the development of invariancies to visual dimensions irrelevant for object identity. This view, when implemented in a neural network, suggests a different set of algorithms for computing object-based visual representations than the ``traditional'' approach pioneered by Marr, 1981. A biologically plausible self-organizing neural network model that develops spatially invariant representations is presented.
There are four features of the self-organizing algorithm that contribute to the development of spatially invariant representations: temporal continuity of environmental stimuli, hysteresis of the activation state (via recurrent activation loops and lateral inhibition in an interactive network), Hebbian learning, and a split pathway between ``what'' and ``where'' representations. These constraints are tested with a backprop network, which allows for the evaluation of the individual contributions of each constraint on the development of spatially invariant representations. Subsequently, a complete model embodying a modified Hebbian learning rule and interactive connectivity is developed from biological and computational considerations. The activational stability and weight function maximization properties of this interactive network are analyzed using a Lyapunov function approach. The model is tested first on the same simple stimuli used in the backprop simulation, and then with a more complex environment consisting of right and left diagonal lines. The results indicate that the hypothesized constraints, implemented in a Hebbian network, were capable of producing spatially invariant representations. Further, evidence for the gradual integration of both featural complexity and spatial invariance over increasing layers in the network, thought to be important for real-world applications, was obtained. As the approach is generalizable to other dimensions such as orientation and size, it could provide the basis of a more complete biologically plausible object recognition system. Indeed, this work forms the basis of a recent model of object recognition in the domestic chick (O'Reilly & Johnson, 1993, TR PDP.CNS.93.1). ----------------------------------------------------------------------- Retrieval information for pdp.cns TRs: unix> ftp 128.2.248.152 # hydra.psy.cmu.edu Name: anonymous Password: ftp> cd pub/pdp.cns ftp> binary ftp> get pdp.cns.93.1.ps.Z # or, and ftp> get pdp.cns.92.5.ps.Z ftp> quit unix> zcat pdp.cns.93.1.ps.Z | lpr # or however you print postscript unix> zcat pdp.cns.92.5.ps.Z | lpr For those who do not have FTP access, physical copies can be requested from Barbara Dorney .  From tresp at inf21.zfe.siemens.de Tue Feb 2 12:29:10 1993 From: tresp at inf21.zfe.siemens.de (Volker Tresp) Date: Tue, 2 Feb 1993 18:29:10 +0100 Subject: paper in neuroprose Message-ID: <199302021729.AA24088@inf21.zfe.siemens.de> The following paper has been placed in the neuroprose archive as tresp.rules.ps.Z Instructions for retrieving and printing follow the abstract. ----------------------------------------------------------------- NETWORK STRUCTURING AND TRAINING USING RULE-BASED KNOWLEDGE ----------------------------------------------------------------- Volker Tresp, Siemens, Central Research Juergen Hollatz, TU Muenchen Subutai Ahmad, Siemens, Central Research Abstract We demonstrate in this paper how certain forms of rule-based knowledge can be used to prestructure a neural network of normalized basis functions and give a probabilistic interpretation of the network architecture. We describe several ways to assure that rule-based knowledge is preserved during training and present a method for complexity reduction that tries to minimize the number of rules and the number of conjuncts. After training, the refined rules are extracted and analyzed. To appear in: S. J. Hanson, J. D. Cowan, and C. L. Giles (Eds.), Advances in Neural Information Processing Systems 5. San Mateo CA: Morgan Kaufmann. 
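One way to picture the prestructuring described in the abstract above is as a network of normalized basis functions in which each initial basis function is derived from one rule. The C sketch below is a generic normalized Gaussian basis function forward pass under that reading; the Rule fields (centre, width, conclusion), the Gaussian form and the toy rules in main() are assumptions made for illustration and are not the authors' exact formulation.

/*
 * Minimal sketch of a network of normalized Gaussian basis functions, with
 * each basis function initialized from a "rule".  All concrete values here
 * are illustrative assumptions.
 */
#include <stdio.h>
#include <math.h>

#define N_INPUTS 2
#define N_RULES  3

typedef struct {
    double centre[N_INPUTS];  /* premise: "x is approximately centre" */
    double width;             /* how sharply the premise is satisfied */
    double conclusion;        /* output value proposed by the rule    */
} Rule;

/* Unnormalized Gaussian activation of one rule for input x. */
static double basis(const Rule *r, const double *x)
{
    double d2 = 0.0;
    int i;
    for (i = 0; i < N_INPUTS; i++) {
        double d = x[i] - r->centre[i];
        d2 += d * d;
    }
    return exp(-d2 / (2.0 * r->width * r->width));
}

/* Network output: rule conclusions weighted by normalized activations,
 * so the activations behave like posterior probabilities of the rules. */
double rule_network(const Rule *rules, int n, const double *x)
{
    double num = 0.0, den = 1e-12;   /* small constant avoids division by zero */
    int i;
    for (i = 0; i < n; i++) {
        double b = basis(&rules[i], x);
        num += b * rules[i].conclusion;
        den += b;
    }
    return num / den;
}

int main(void)
{
    /* Three toy rules on a 2-d input, purely for illustration. */
    Rule rules[N_RULES] = {
        { {0.0, 0.0}, 0.5, -1.0 },
        { {1.0, 0.0}, 0.5,  0.0 },
        { {1.0, 1.0}, 0.5,  1.0 }
    };
    double x[N_INPUTS] = {0.8, 0.2};
    printf("output at (%.1f, %.1f) = %f\n", x[0], x[1],
           rule_network(rules, N_RULES, x));
    return 0;
}

In such a prestructured network the centres, widths and conclusions are ordinary parameters, so gradient training can refine them, and the refined basis functions can afterwards be read back as rules, which is the extraction step the abstract mentions.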
---- Volker Tresp Siemens AG, Central Research, Phone: +49 89 636-49408 Otto-Hahn-Ring 6, FAX: +49 89 636-3320 W-8000 Munich 83, Germany E-mail: tresp at zfe.siemens.de unix> ftp archive.cis.ohio-state.edu (or 128.146.8.52) Name: anonymous Password: neuron ftp> cd pub/neuroprose ftp> binary ftp> get tresp.rules.ps.Z ftp> quit unix> uncompress tresp.rules.ps.Z unix> lpr -s tresp.rules.ps (or however you print postscript)  From denis at psy.ox.ac.uk Wed Feb 3 10:47:36 1993 From: denis at psy.ox.ac.uk (Denis Mareschal) Date: Wed, 3 Feb 93 15:47:36 GMT Subject: visual tracking Message-ID: <9302031547.AA09779@dragon.psych.pdp> Hi, A couple of months ago I sent around a request for further information concerning higher level connectionist approaches to the development of visual tracking. I received a number of replies spanning the broad range of fields in which neural network research is being conducted. I also received a significant number of requests for the resulting compiled list of references. I am thus posting a list of references resulting directly and indirectly from my original request. I have also included a few relevant psychology review papers. Thanks to all those who replied. Clearly this list is not exhaustive and if anyone reading it notices an ommission which may be of interest I would greatly appreciate hearing from them. Cheers, Denis Mareschal Department of Experimental Psychology South Parks Road Oxford University Oxford OX1 3UD maresch at black.ox.ac.uk REFERENCES: Allen, R. B. (1988), Sequential connectionist networks for answering simple questions about a microworld. In: Proceedings of the Tenth Annual Conference of the Cognitive Science Society, pp. 489-495, Hillsdale, NJ: Erlbaum. Baloch, A. A. & Waxman A. M. (1991). Visual learning, adaptive expectations and behavioral conditioning of the mobile robot MAVIN, Neural Networks, vol. 4, pp. 271-302. Buck, D. S. & Nelson D. E. (1992). Applying the abductory induction mechanism (AIM) to the extrapolation of chaotic time series. In: Proceedings of the National Aerospace Electronics Conference (NAECON), 18-22 May, Dayton, Ohio, vol. 3, pp 910-915. Bremner, J. G. (1985). Object tracking and search in infancy: A review of data and a theoretical evaluation, Developmental Review, 5, pp. 371-396 Carpenter, G. A. & Grossberg, S. (1992). Neural Networks for Vision and Image Processing, Cambridge, MA: MIT Press. Cleermans, A., Servan-Schreiber, D. & McClelland, J. L. (1989). Finite state automata and simple recurrent networks, Neural Computation,1, pp 372- 381. Deno, D. C., Keller, E. L. & Crandall, W. F. (1989). Dynamical neural network organization of the visual pursuit system, IEEE Transactions on Biomedical Engineering, vol. 36, pp. 85-91. Dobnikar, A., Likar, A. & Podbregar, D. (1989). Optimal visual tracking with artificial neural network. In: First I.E.E. International Conference on Artificial Neural Networks (conf. Publ. 313), pp 275-279. Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14, pp. 179-211. Ensley, D. & Nelson, D. E. (1992). Applying Cascade-correlation to the extrapolation of chaotic time series. Proceedings of the Third Workshop on Neural Networks: Academic/Industrial/NASA/Defense; 10-12 February, Auburn, Alabama. Fay, D. A. & Waxman, A. M. (1992). Neurodynamics of real-time image velocity extraction. In: G. A. Carpenter & S. Grossberg (Eds), Neural Networks for Vision and Image Processing, pp 221-246, Cambridge, MA: MIT Press. Gordon, Steele, & Rossmiller (1991). 
Predicting trajectories using recurrent neural networks. In: Dagli, Kumara, & Shin (Eds), Intelligent Systems Through Artificial Neural Networks, ASME Press. (Sorry that's the best I can do for this reference) Grossberg, S. & Rudd (1989). A neural architecture for visual motion perception: Neural Networks, 2, pp. 421-450. Koch, C. & Ullman, S. (1985). Shifts in selective visual attention: towards the underlying neural circuitry. Human Neurobiology, 4, pp. 219-227. Lisberger, S. G., Morris, E. J. & Tychsen, L. (1987). Visual motion processing and sensory-motor integration for smooth pursuit eye movements, Annual Review of Neuroscience, 10, pp. 97-129. Lumer, E. D. (1992). The phase tracker of attention. In: Proceedings of the Fourteenth Annual Conference of the Cognitive Science Society, pp 962-967, Hillsdale, NJ: Erlbaum. Neilson, P. D., Neilson, M. D. & O'Dwyer, N. J. (1993, in press). What limits high speed tracking performance?, Human Movement Science, 12. Nelson, D. E., Ensley, D. D. & Rogers, S. K. (1992). Prediction of chaotic time series using Cascade Correlation: Effects of number of inputs and training set size. In: The Society for Optical Engineering (SPIE), Proceedings of the Applications of Artificial Neural Networks III Conference, 21-24 April, Orlando, Florida, vol. 1709, pp 823-829. Marshall, J. A. (1990). Self-organizing neural networks for perception of visual motion, Neural Networks, 3, pp. 45-74. Martin, W. N. & Aggarwal, J. K. (Eds) (1988). Motion Understanding: Robot and Human Vision. Boston: Kluwer Academic Publishers. Metzgen, Y. & Lehmann D. (1990). Learning temporal sequences by local synaptic changes, Network, 1, pp. 271-302. Nakayama, K. (1985). Biological image motion processing: A review. Vision Research 25, pp 625-660. Parisi, D., Cecconi, F. & Nolfi, S. (1990). Econets: Neural networks that learn in an environment, Network, 1, pp. 149-168. Pearlmutter, B. A. (1989). Learning state space trajectories in recurrent networks, Neural Computation, 1, pp. 263-269. Regier, T. (1992). The acquisition of lexical semantics for spatial terms: A connectionist model of perceptual categorization. International Computer Science Institute (ICSI) Technical Report TR-92-062, Berkeley. Schmidhuber, J. & Huber, R. (1991). Using adaptive sequential neurocontrol for efficient learning of translation and rotation invariance. In: T. Kohonen, K. Makisara, O. Simula & J. Kangas (Eds), Artificial Neural Networks, pp 315-320, North Holland: Elsevier Science. Schmidhuber, J. & Huber, R. (1991). Learning to generate artificial foveal trajectories for target detection. International Journal of Neural Systems, 2, pp. 135-141. Schmidhuber, J. & Wahnsiedler, R. (1992). Planning simple trajectories using neural subgoal generators. Second International Conference on Simulations of Adaptive Behavior (SAB92). (Available by ftp from Jordan Pollack's Neuroprose Archive). Sereno, M. E. (1986). Neural network model for the measurement of visual motion. Journal of the Optical Society of America A, 3, pp 72. Sereno, M. E. (1987). Implementing stages of motion analysis in neural. Program of the Ninth Annual Conference of the Cognitive Science Society, pp. 405-416, Hillsdale, NJ: Erlbaum. Servan-Schreiber, D., Cleermans, A. & McClelland, J. L. (1991). Graded state machines: The representation of temporal contingencies in simple recurrent networks, 7, pp. 161-193. Shimohara, K., Uchiyama T. & Tokunaya Y. (1988). Back propagation networks for event-driven temporal sequence processing.
In: IEEE International Conference on Neural Networks (San Diego), vol. 1, pp. 665-672, NY: IEEE. Sutton, R. S. (1988). Learning to predict by the methods of temporal differences, Machine Learning, 3, pp 9-44. Tolg, S. (1991). A biologically motivated system to track moving objects by active camera control. In: T. Kohonen, K. Makisara, O. Simula & J. Kangas (Eds), Artificial Neural Networks, pp 1237-1240, North Holland: Elsevier Science. Wechsler, H. (Ed) (1991). Neural Networks for Human and Machine Perception, New York: Academic Press.  From gluck at pavlov.rutgers.edu Wed Feb 3 09:13:20 1993 From: gluck at pavlov.rutgers.edu (Mark Gluck) Date: Wed, 3 Feb 93 09:13:20 EST Subject: Preprint: Computational Models of the Neural Bases of Learning and Memory Message-ID: <9302031413.AA24540@james.rutgers.edu> For (hard copy) preprints of the following article: Gluck, M. A. & Granger, R. C. (1993). Computational models of the neural bases of learning and memory. Annual Review of Neuroscience. 16: 667-706 ABSTRACT: Advances in computational analyses of parallel processing have made computer simulation of learning systems an increasingly useful tool in understanding complex aggregate functional effects of changes in neural systems. In this article, we review current efforts to develop computational models of the neural bases of learning and memory, with a focus on the behavioral implications of network-level characterizations of synaptic change in three anatomical regions: olfactory (piriform) cortex, cerebellum, and the hippocampal formation. ____________________________________ Send US-mail address to: Mark Gluck (Center for Neuroscience, Rutgers-Newark) gluck at pavlov.rutgers.edu  From robtag at udsab.dia.unisa.it Wed Feb 3 13:22:31 1993 From: robtag at udsab.dia.unisa.it (Tagliaferri Roberto) Date: Wed, 3 Feb 1993 19:22:31 +0100 Subject: course on Hybrid Systems Message-ID: <199302031822.AA08460@udsab.dia.unisa.it> **************** IIASS 1993 February Courses ************** **************** Last Announcement ************** A short course on "Hybrid Systems: Neural Nets, Fuzzy Sets and A.I. Systems" February 9 - 12 Lecturers: Dr. Silvano Colombano, NASA Research Center, CA Prof. Piero Morasso, Univ. Genova, Italia ----------------------------------------------------------------- Dr. Silvano Colombano (4 hours) Introduction: extending the representational power of connectionism The interim approach: hybrid symbolic connectionist systems - Distributed - Localist - Mixed localist and distributed (3 hours) Hybrid Fuzzy Logic connectionist systems - Classification - Control - Reasoning (2 hours) A competing approach: classifier systems Future directions Prof. Piero Morasso (2 hours) Self-organizing Systems and Hybrid Systems Course schedule February 9 3 pm - 6 pm Dr. S. Colombano February 10 3 pm - 6 pm Dr. S. Colombano February 11 3 pm - 6 pm Dr. S. Colombano February 12 3 pm - 5 pm Prof. P. Morasso The course will be held at IIASS, via G. Pellegrino, Vietri s/m (Sa) Italia. Participants will pay their own fare and travel expenses. No fees to be paid. The short course is sponsored by Progetto Finalizzato CNR "Sistemi Informatici e Calcolo Parallelo" and by Contratto quinquennale CNR-IIASS For any information about the short course, please contact the IIASS secretariat I.I.A.S.S Via G.Pellegrino, 19 I-84019 Vietri Sul Mare (SA) ITALY Tel. +39 89 761167 Fax +39 89 761189 or Dr.
Roberto Tagliaferri E-Mail robtag at udsab.dia.unisa.it  From uli at ira.uka.de Thu Feb 4 12:06:41 1993 From: uli at ira.uka.de (Uli Bodenhausen) Date: Thu, 04 Feb 93 18:06:41 +0100 Subject: new papers in the neuroprose archive Message-ID: The following papers have been placed in the neuroprose archive as bodenhausen.application_oriented.ps.Z bodenhausen.architectural_learning.ps.Z Instructions for retrieving and printing follow the abstracts. 1.) CONNECTIONIST ARCHITECTURAL LEARNING FOR HIGH PERFORMANCE CHARACTER AND SPEECH RECOGNITION Ulrich Bodenhausen and Stefan Manke University of Karlsruhe and Carnegie Mellon University Highly structured neural networks like the Time-Delay Neural Network (TDNN) can achieve very high recognition accuracies in real world applications like handwritten character and speech recognition systems. Achieving the best possible performance greatly depends on the optimization of all structural parameters for the given task and amount of training data. We propose an Automatic Structure Optimization (ASO) algorithm that avoids time-consuming manual optimization and apply it to Multi State Time-Delay Neural Networks, a recent extension of the TDNN. We show that the ASO algorithm can construct efficient architectures in a single training run that achieve very high recognition accuracies for two handwritten character recognition tasks and one speech recognition task. (only 4 pages!) To appear in the proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP) 93, Minneapolis -------------------------------------------------------------------------- 2.) Application Oriented Automatic Structuring of Time-Delay Neural Networks for High Performance Character and Speech Recognition Ulrich Bodenhausen and Alex Waibel University of Karlsruhe and Carnegie Mellon University Highly structured artificial neural networks have been shown to be superior to fully connected networks for real-world applications like speech recognition and handwritten character recognition. These structured networks can be optimized in many ways, and have to be optimized for optimal performance. This makes the manual optimization very time consuming. A highly structured approach is the Multi State Time Delay Neural Network (MSTDNN) which uses shifted input windows and allows the recognition of sequences of ordered events that have to be observed jointly. In this paper we propose an Automatic Structure Optimization (ASO) algorithm and apply it to MSTDNN type networks. The ASO algorithm optimizes all relevant parameters of MSTDNNs automatically and was successfully tested with three different tasks and varying amounts of training data. (6 pages, more detailed than the first paper) To appear in the ICNN 93 proceedings, San Francisco. -------------------------------------------------------------------------- unix> ftp archive.cis.ohio-state.edu (or 128.146.8.52) Name: anonymous Password: neuron ftp> cd pub/neuroprose ftp> binary ftp> get bodenhausen.application_oriented.ps.Z ftp> get bodenhausen.architectural_learning.ps.Z ftp> quit unix> uncompress bodenhausen.application_oriented.ps.Z unix> uncompress bodenhausen.architectural_learning.ps.Z unix> lpr -s bodenhausen.application_oriented.ps (or however you print postscript) unix> lpr -s bodenhausen.architectural_learning.ps Thanks to Jordan Pollack for providing this service!
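The "shifted input windows" that both abstracts above refer to are the defining feature of time-delay architectures: one set of weights is applied to a window of consecutive input frames, and the window is slid over the whole sequence (a 1-d convolution over time). The C sketch below shows a single such layer; the frame size, window width, unit count and test data are illustrative assumptions, and the ASO algorithm that tunes such parameters automatically is not reproduced here.

/*
 * One time-delay layer: the same weight kernel is applied at every
 * position of the window as it is shifted over the input frames, followed
 * by a squashing function.  All sizes are illustrative assumptions.
 */
#include <stdio.h>
#include <math.h>

#define N_FRAMES   10   /* length of the input sequence             */
#define FRAME_DIM   3   /* coefficients per input frame             */
#define WINDOW      3   /* frames seen by one unit (the time delay) */
#define N_UNITS     2   /* feature detectors in this layer          */
#define N_OUT  (N_FRAMES - WINDOW + 1)

static double squash(double x) { return tanh(x); }

void tdnn_layer(const double input[N_FRAMES][FRAME_DIM],
                const double w[N_UNITS][WINDOW][FRAME_DIM],
                const double bias[N_UNITS],
                double output[N_OUT][N_UNITS])
{
    int t, u, d, k;
    for (t = 0; t < N_OUT; t++)
        for (u = 0; u < N_UNITS; u++) {
            double sum = bias[u];
            for (d = 0; d < WINDOW; d++)
                for (k = 0; k < FRAME_DIM; k++)
                    sum += w[u][d][k] * input[t + d][k];
            output[t][u] = squash(sum);
        }
}

int main(void)
{
    double input[N_FRAMES][FRAME_DIM];
    double w[N_UNITS][WINDOW][FRAME_DIM];
    double bias[N_UNITS] = {0.0, 0.1};
    double output[N_OUT][N_UNITS];
    int t, u, d, k;

    /* Fill the input and weights with simple deterministic values. */
    for (t = 0; t < N_FRAMES; t++)
        for (k = 0; k < FRAME_DIM; k++)
            input[t][k] = sin(0.5 * t + k);
    for (u = 0; u < N_UNITS; u++)
        for (d = 0; d < WINDOW; d++)
            for (k = 0; k < FRAME_DIM; k++)
                w[u][d][k] = 0.1 * (u + 1) * (d - 1 + 0.5 * k);

    tdnn_layer(input, w, bias, output);
    for (t = 0; t < N_OUT; t++)
        printf("t=%d  unit0=%7.4f  unit1=%7.4f\n", t, output[t][0], output[t][1]);
    return 0;
}

The structural parameters that ASO is described as tuning (window widths, numbers of units and states, and so on) correspond here to the fixed constants WINDOW, N_UNITS and FRAME_DIM; making them trainable or searchable is exactly what distinguishes the papers' approach from a hand-tuned layer like this one.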
From moody at chianti.cse.ogi.edu Thu Feb 4 20:38:08 1993 From: moody at chianti.cse.ogi.edu (John Moody) Date: Thu, 4 Feb 93 17:38:08 -0800 Subject: NATO ASI: March 5 Deadline Approaching Message-ID: <9302050138.AA00659@chianti.cse.ogi.edu> As the March 5th application deadline is now four weeks away, I am posting this notice again. NATO Advanced Studies Institute (ASI) on Statistics and Neural Networks June 21 - July 2, 1993, Les Arcs, France Directors: Professor Vladimir Cherkassky, Department of Electrical Eng., University of Minnesota, Minneapolis, MN 55455, tel.(612)625-9597, fax (612)625- 4583, email cherkass at ee.umn.edu Professor Jerome H. Friedman, Statistics Department, Stanford University, Stanford, CA 94309 tel(415)723-9329, fax(415)926-3329, email jhf at playfair.stanford.edu Professor Harry Wechsler, Computer Science Department, George Mason University, Fairfax VA22030, tel(703)993-1533, fax(703)993-1521, email wechsler at gmuvax2.gmu.edu List of invited lecturers: I. Alexander, L. Almeida, A. Barron, A. Buja, E. Bienenstock, G. Carpenter, V. Cherkassky, T. Hastie, F. Fogelman, J. Friedman, H. Freeman, F. Girosi, S. Grossberg, J. Kittler, R. Lippmann, J. Moody, G. Palm, R. Tibshirani, H. Wechsler, C. Wellekens Objective, Agenda and Participants: Nonparametric estimation is a problem of fundamental importance for many applications involving pattern classification and discrimination. This problem has been addressed in Statistics, Pattern Recognition, Chaotic Systems Theory, and more recently in Artificial Neural Network (ANN) research. This ASI will bring together leading researchers from these fields to present an up-to-date review of the current state-of-the art, to identify fundamental concepts and trends for future development, to assess the relative advantages and limitations of statistical vs neural network techniques for various pattern recognition applications, and to develop a coherent framework for the joint study of Statistics and ANNs. Topics range from theoretical modeling and adaptive computational methods to empirical comparisons between statistical and neural network techniques. Lectures will be presented in a tutorial manner to benefit the participants of ASI. A two-week programme is planned, complete with lectures, industrial/government sessions, poster sessions and social events. It is expected that over seventy students (which can be researchers or practitioners at the post-graduate or graduate level) will attend, drawn from each NATO country and from Central and Eastern Europe. The proceedings of ASI will be published by Springer-Verlag. Applications: Applications for participation at the ASI are sought. Prospective students, industrial or government participants should send a brief statement of what they intend to accomplish and what form their participation would take. Each application should include a curriculum vitae, with a brief summary of relevant scientific or professional accomplishments, and a documented statement of financial need (if funds are applied for). Optionally, applications may include a one page summary for making a short presentation at the poster session. Poster presentations focusing on comparative evaluation of statistical and neural network methods and application studies are especially sought. For junior applicants, support letters from senior members of the professional community familiar with the applicant's work would strengthen the application. 
Prospective participants from Greece, Portugal and Turkey are especially encouraged to apply. Costs and Funding: The estimated cost of hotel accommodations and meals for the two-week duration of the ASI is US$1,600. In addition, participants from industry will be charged an industrial registration fee, not to exceed US$1,000. Participants representing industrial sponsors will be exempt from the fee. We intend to subsidize costs of participants to the maximum extent possible by available funding. Prospective participants should also seek support from their national scientific funding agencies. The agencies, such as the American NSF or the German DFG, may provide some ASI travel funds upon the recommendation of an ASI director. Additional funds exist for students from Greece, Portugal and Turkey. We are also seeking additional sponsorship of ASI. Every sponsor will be fully acknowledged at the ASI site as well as in the printed proceedings. Correspondence and Registration: Applications should be forwarded to Dr. Cherkassky at the above address. Applications arriving after March 5, 1993 may not be considered. All approved applicants will be informed of the exact registration arrangements. Informal email inquiries can be addressed to Dr. Cherkassky at nato_asi at ee.umn.edu  From takagi at diva.berkeley.edu Thu Feb 4 21:48:15 1993 From: takagi at diva.berkeley.edu (Hideyuki Takagi) Date: Thu, 4 Feb 93 18:48:15 -0800 Subject: BISC Special Seminar Message-ID: <9302050248.AA02922@diva.Berkeley.EDU> Dear Colleagues: We will hold the BISC Special Seminar at UC Berkeley one day before FUZZ-IEEE'93/ICNN'93. Please forward the following announcement widely. Hideyuki TAKAGI ----------------------------------------------------------------------- EXTENDED BISC SPECIAL SEMINAR 10:30AM-5:45PM, March 28 (Sunday), 1993 Sibley Auditorium (210) in Bechtel Hall University of California, Berkeley CA 94720 BISC (Berkeley Initiative for Soft Computing) of UC Berkeley will hold a Special Seminar to take advantage of the presence in the San Francisco area of the luminaries attending FUZZ-IEEE'93/ICNN'93. We hope that your schedule will allow you to participate. PROGRAM: 10:30-11:00 Lotfi A. Zadeh (Univ. of California, Berkeley) Soft Computing 11:00-12:00 Hidetomo Ichihashi / Univ. of Osaka Prefecture Neuro-Fuzzy Approaches to Optimization and Inverse Problems 12:00- 1:30 (lunch) 1:30- 2:30 Philippe Smets (Iridia Universite Libre de Bruxelles) Imperfect information: Imprecision - Uncertainty 2:30- 3:30 Teuvo Kohonen (Helsinki University of Technology) Competitive-Learning Neural Networks are closest to Biology 3:30- 3:45 (break) 3:45- 4:45 Michio Sugeno (Tokyo Institute of Technology) Fuzzy Modeling towards Qualitative Modeling 4:45- 5:45 Hugues Bersini (Iridia Universite Libre de Bruxelles) The Immune Learning Mechanisms: Reinforcement, Recruitment and their Applications REGISTRATION: Attendance is free and registration is not required. HOW TO GET HERE: [BART subway from San Francisco downtown] The closest station to the SF Hilton Hotel is the Powell Str. Station. Berkeley is a safe 24 minute ride from the Powell Str. Station. You must catch the Concord bound train and transfer onto a Richmond bound train at the Oakland City Center-12th Str. Station. Trains on Sunday rendezvous every 20 minutes as indicated below.
Powell      12th Str.    Berkeley
 8:17         8:31          8:41
 8:37         8:51          9:01
 8:57         9:11          9:21
 9:17         9:31          9:41
 9:37         9:51         10:01
It takes 15-20 minutes on foot from the Berkeley BART Station to reach Bechtel Hall, which is located on the North-East part of campus. Bechtel Hall is just North of Evans Hall, home of the Computer Science Division. North Gate is the nearest campus gate. [TAXI] You can take a taxi from the front of the Berkeley BART Station. Ask the taxi driver to enter from East Gate on campus and let you off at Mining Circle. The tallest building adjacent to the circle is Evans Hall. Bechtel Hall is just north of the Evans. [CAR] Get off at the University Ave. exit from Interstate 80. The east end of University Ave. is the West Gate to UC Berkeley. Most street parking is free on Sunday, but it may be scarce and remember to read the signs. If you feel you must park in a lot, we recommend UCB Parking Structure H which is located at the corner of Hearst and La Loma Avenues. You must buy an all day parking ticket from the vending machine located on the 2nd level (the only one in the structure). You need to prepare 12 quarters. Illegal parking in Berkeley is expensive. CONTACT ADDRESS: Hideyuki TAKAGI, Coordinator of this seminar (takagi at cs.berkeley.edu) Lotfi A. Zadeh, Director of BISC (zadeh at cs.berkeley.edu) Computer Science Division University of California at Berkeley Berkeley, CA 94720 FAX <+1>510-642-5775  From ira at linus.mitre.org Fri Feb 5 10:06:53 1993 From: ira at linus.mitre.org (ira@linus.mitre.org) Date: Fri, 5 Feb 93 10:06:53 -0500 Subject: vision position posting Message-ID: <9302051506.AA09737@ellington.mitre.org> Neural Network Vision Research Position The MITRE Corporation is looking for a Vision Modeler with an excellent math background, knowledge of signal processing techniques, considerable experience modeling biological low-level vision processes and broad knowledge of current neural network learning algorithm research. This is an *applied* research position which has as its goal the application of vision modeling techniques to real tasks such as 2D and 3D object recognition in synthetic and real world imagery. This position requires software implementation of models in C language. The position may also involve management responsibilities. The position is located in Bedford, Massachusetts. We are looking for someone with availability within the next two months. Interested applicants should send a resume and representative publications to: Ira Smotroff Lead Scientist The MITRE Corporation MS K331 202 Burlington Rd. Bedford, MA 01730-1420  From heiniw at sun1.eeb.ele.tue.nl Fri Feb 5 09:56:17 1993 From: heiniw at sun1.eeb.ele.tue.nl (Heini Withagen) Date: Fri, 5 Feb 1993 15:56:17 +0100 (MET) Subject: Does backprop need the derivative ?? Message-ID: <9302051456.AA02038@sun1.eeb.ele.tue.nl> A non-text attachment was scrubbed... Name: not available Type: text Size: 1054 bytes Desc: not available Url : https://mailman.srv.cs.cmu.edu/mailman/private/connectionists/attachments/00000000/b760eda0/attachment-0001.ksh From wallyn at capsogeti.fr Fri Feb 5 13:05:20 1993 From: wallyn at capsogeti.fr (Alexandre Wallyn) Date: Fri, 5 Feb 93 19:05:20 +0100 Subject: Neural networks in Product modelling Message-ID: <9302051805.AA13434@gizmo> I am trying to evaluate the state of the art in the connectionist applications in Product Modelling (or engineering design).
After looking in several journals (Neural Networks, IJCNN proceedings, Neuro-Nimes, and some history of connectionist mailing list), I only found: "Neural Network in Engineering Design" (H.Adeli, IJCNN 1990) (very general) Indirect quotations of general work in AI Wright University (1988) Modelling of MOS components in University of Dortmund (1990) and CadChem product of AIWare for product modelling and chemical formulation (seem to be uses by General Tire and Good Year). Are these applications in product modelling so scarce, or are they published in other forums ? I thank you in advance for your help. I will, of course, publish a summary of the replies. Alexandre Wallyn CAP GEMINI INNOVATION 86-90, rue Thiers 92513 BOULOGNE FRANCE wallyn at capsogeti.fr  From ira at linus.mitre.org Fri Feb 5 10:14:46 1993 From: ira at linus.mitre.org (ira@linus.mitre.org) Date: Fri, 5 Feb 93 10:14:46 -0500 Subject: vision position: US Citizens only Message-ID: <9302051514.AA09747@ellington.mitre.org> Sorry to clutter your mail boxes. The Neural Network Vision Position at The MITRE Corporation is open only to US Citizens. Ira Smotroff  From Scott_Fahlman at SEF-PMAX.SLISP.CS.CMU.EDU Fri Feb 5 22:55:28 1993 From: Scott_Fahlman at SEF-PMAX.SLISP.CS.CMU.EDU (Scott_Fahlman@SEF-PMAX.SLISP.CS.CMU.EDU) Date: Fri, 05 Feb 93 22:55:28 EST Subject: Does backprop need the derivative ?? In-Reply-To: Your message of Fri, 05 Feb 93 15:56:17 +0100. <9302051456.AA02038@sun1.eeb.ele.tue.nl> Message-ID: In his paper, 'An Empirical Study of Learning Speed in Back-Propagation Networks', Scott E. Fahlmann shows that with the encoder/decoder problem it is possible to replace the derivative of the transfer function by a constant. I have been able to reproduce this example. However, for several other examples, it was not possible to get the network converged using a constant for the derivative. Interesting. I just tried this on encoder problems and a couple of other simple things, and leapt to the conclusion that it was a general phenomenon. It seems plausible to me that any "derivative" function that preserves the sign of the error and doesn't have a "flat spot" (stable point of 0 derivative) would work OK, but I don't know of anyone who has made an extensive study of this. I'd be interested in hearing more about the problems you've encountered and about any results others send to you. -- Scott =========================================================================== Scott E. Fahlman Internet: sef+ at cs.cmu.edu Senior Research Scientist Phone: 412 268-2575 School of Computer Science Fax: 412 681-5739 Carnegie Mellon University Latitude: 40:26:33 N 5000 Forbes Avenue Longitude: 79:56:48 W Pittsburgh, PA 15213 ===========================================================================  From marwan at sedal.su.oz.au Sat Feb 6 07:49:53 1993 From: marwan at sedal.su.oz.au (Marwan Jabri) Date: Sat, 6 Feb 1993 23:49:53 +1100 Subject: Does backprop need the derivative ?? Message-ID: <9302061249.AA17234@sedal.sedal.su.OZ.AU> As the intention of the inquirer is the analog implementation of backprop, I see two problems: 1- the question whether the derivative can be replaced by a constant, and more importantly 2- whether the precision of the analog implementation will be high enough for backprop to work. Regarding (1), it is likely as Scott Fahlman suggested any derivative that "preserves" the error sign may do the job. 
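As a concrete illustration of that suggestion, here is a minimal sketch of plain batch backprop on XOR in which the sigmoid derivative o(1-o) can be swapped for a sign-preserving constant; the same idea, applied to the whole weight gradient rather than the unit derivative, is what is later called "Manhattan updating" in this thread. The 2-2-1 network, the constant 0.25, the learning rate and the XOR task are illustrative assumptions, not anything the posters specified.

# Minimal sketch: batch backprop on XOR where the sigmoid derivative o*(1-o)
# can be replaced by a sign-preserving constant (assumed value 0.25).
# Network size, learning rate and epoch count are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(scale=0.5, size=(2, 2)); b1 = np.zeros(2)   # input -> hidden
W2 = rng.normal(scale=0.5, size=(2, 1)); b2 = np.zeros(1)   # hidden -> output

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

USE_CONSTANT = True   # True: constant 0.25 everywhere; False: true derivative

def deriv(o):
    return np.full_like(o, 0.25) if USE_CONSTANT else o * (1.0 - o)

lr = 0.5
for epoch in range(5000):
    h = sigmoid(X @ W1 + b1)                   # forward pass
    y = sigmoid(h @ W2 + b2)
    delta_out = (y - T) * deriv(y)             # backward pass, squared error
    delta_hid = (delta_out @ W2.T) * deriv(h)
    W2 -= lr * (h.T @ delta_out); b2 -= lr * delta_out.sum(axis=0)
    W1 -= lr * (X.T @ delta_hid); b1 -= lr * delta_hid.sum(axis=0)

print("outputs after training:", y.ravel())

With USE_CONSTANT set to False the same loop is ordinary backprop, so the two variants can be compared directly on a given task.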
The question however is the implication in terms of convergence speed, and the comparison thereof with perturbation type training methods. Regarding (2), there has been several reports indicating that backpropagation simply does not work when the number of bits is reduced towards 6-8 bits! Marwan ------------------------------------------------------------------- Marwan Jabri Email: marwan at sedal.su.oz.au Senior Lecturer Tel: (+61-2) 692-2240 SEDAL, Electrical Engineering, Fax: 660-1228 Sydney University, NSW 2006, Australia Mobile: (+61-18) 259-086  From jlm at crab.psy.cmu.edu Sat Feb 6 08:39:43 1993 From: jlm at crab.psy.cmu.edu (James L. McClelland) Date: Sat, 6 Feb 93 08:39:43 EST Subject: Does backprop need the derivative ?? In-Reply-To: Scott_Fahlman@sef-pmax.slisp.cs.cmu.edu's message of Fri, 05 Feb 93 22:55:28 EST Message-ID: <9302061339.AA19977@crab.psy.cmu.edu.noname> Re the discussion concerning replacing the derivative of the activations of units with a constant: Some work has been done using the activation rather than the derivative of the activation by Nestor Schmajuk. He is interested in biologically plausible models and tends to keep hidden units in the bottom half of the sigmoid. In that case they can be approximated by exponentials and so the derivative can be approximated by the activation. Approx ref: Schmajuk and DiCarlo, Psychological Review, 1992 - Jay McClelland  From ljubomir at darwin.bu.edu Sat Feb 6 11:17:56 1993 From: ljubomir at darwin.bu.edu (Ljubomir Buturovic) Date: Sat, 6 Feb 93 11:17:56 -0500 Subject: Does backprop need the derivative ?? Message-ID: <9302061617.AA13641@darwin.bu.edu> Mr. Heini Withagen says: > I am working on an analog chip implementing a feedforward > network and I am planning to incorporate backpropagation learning > on the chip. If it would be the case that the backpropagation > algorithm doesn't need the derivative, it would simplify the > design enormously. We have trained multilayer perceptron without derivatives, using simplex algorithm for multidimensional optimization (not to be confused with simplex algorithm for linear programming). From our experiments, it turns out that it can be done, however the number of weights is seriously limited, since the memory complexity of simplex is N^2, where N is the total number of variable weights in the network. See reference for further details (the reference is available as a LaTeX file from ljubomir at darwin.bu.edu). Lj. Buturovic, Lj. Citkusev, ``Back Propagation and Forward Propagation,'' in Proc. Int. Joint Conf. Neural Networks, (Baltimore, MD), 1992, pp. IV-486 -- IV-491. Ljubomir Buturovic Boston University BioMolecular Engineering Research Center 36 Cummington Street, 3rd Floor Boston, MA 02215 office: 617-353-7123 home: 617-738-6487  From gary at cs.ucsd.edu Sat Feb 6 11:20:57 1993 From: gary at cs.ucsd.edu (Gary Cottrell) Date: Sat, 6 Feb 93 08:20:57 -0800 Subject: Does backprop need the derivative ?? Message-ID: <9302061620.AA29550@odin.ucsd.edu> I happen to know it doesn't work for a more complicated encoder problem: Image compression. When Paul Munro & I were first doing image compression back in 86, the error would go down and then back up! Rumelhart said: "there's a bug in your code" and indeed there was: we left out the derivative on the hidden units. -g.  From radford at cs.toronto.edu Sun Feb 7 12:24:15 1993 From: radford at cs.toronto.edu (Radford Neal) Date: Sun, 7 Feb 1993 12:24:15 -0500 Subject: Does backprop need the derivative? 
Message-ID: <93Feb7.122429edt.227@neuron.ai.toronto.edu> Other posters have discussed, regarding backprop... > ... the question whether the derivative can be replaced by a constant, To clarify, I believe the intent is that the "constant" have the same sign as the derivative, but have constant magnitude. Marwan Jabri says... > Regarding (1), it is likely as Scott Fahlman suggested any derivative > that "preserves" the error sign may do the job. One would expect this to work only for BATCH training. On-line training approximates the batch result only if the net result of updating the weights on many training cases mimics the summing of derivatives in the batch scheme. This will not be the case if a training case where the derivative is +0.00001 counts as much as one where it is +10000. This is not to say it might not work in some cases. There's just no reason to think that it will work generally. Radford Neal  From Scott_Fahlman at SEF-PMAX.SLISP.CS.CMU.EDU Sun Feb 7 12:56:03 1993 From: Scott_Fahlman at SEF-PMAX.SLISP.CS.CMU.EDU (Scott_Fahlman@SEF-PMAX.SLISP.CS.CMU.EDU) Date: Sun, 07 Feb 93 12:56:03 EST Subject: Does backprop need the derivative ?? In-Reply-To: Your message of Sat, 06 Feb 93 23:49:53 +1100. <9302061249.AA17234@sedal.sedal.su.OZ.AU> Message-ID: As the intention of the inquirer is the analog implementation of backprop, I see two problems: 1- the question whether the derivative can be replaced by a constant, and more importantly 2- whether the precision of the analog implementation will be high enough for backprop to work. ... Regarding (2), there has been several reports indicating that backpropagation simply does not work when the number of bits is reduced towards 6-8 bits! It is true that several studies show a sudden failure of backprop learning when you use fixnum arithmetic and reduce the number of bits per word. The point of failure seems to be problem-specific, but is often around 10-14 bits (incuding sign). Marcus Hoehfeld and I studied this issue and found that the source of the failure was a quantization effect: the learning algorithm needs to accumulate lots of small steps, for weight-update or whatever, and since these are smaller than half the low-order bit, it ends up accumulating a lot of zeros instead. We showed that if a form of probabilisitic rounding (dithering) is used to smooth over these quantization steps, learning continues on down to 4 bits or fewer, with only a gradual degradation in learning time, number of units/weights required, and quality of the result. This study used Cascor, but we believe that the results hold for backprop as well. Marcus Hoehfeld and Scott E. Fahlman (1992) "Learning with Limited Numerical Precision Using the Cascade-Correlation Learning Algorithm" in IEEE Transactions on Neural Networks, Vol. 3, no. 4, July 1992, pp. 602-611. Of course, a learning system implemented in analog hardware might have only a few bits of accuracy due to noise and nonlinearity in the circuits, but it wouldn't suffer from this quantization effect, since you get a sort of probabilistic dithering for free. -- Scott =========================================================================== Scott E. 
Fahlman Internet: sef+ at cs.cmu.edu Senior Research Scientist Phone: 412 268-2575 School of Computer Science Fax: 412 681-5739 Carnegie Mellon University Latitude: 40:26:33 N 5000 Forbes Avenue Longitude: 79:56:48 W Pittsburgh, PA 15213 ===========================================================================  From kolen-j at cis.ohio-state.edu Sun Feb 7 11:31:20 1993 From: kolen-j at cis.ohio-state.edu (john kolen) Date: Sun, 7 Feb 93 11:31:20 -0500 Subject: Does backprop need the derivative ?? In-Reply-To: "James L. McClelland"'s message of Sat, 6 Feb 93 08:39:43 EST <9302061339.AA19977@crab.psy.cmu.edu.noname> Message-ID: <9302071631.AA19877@pons.cis.ohio-state.edu> Back prop does not need THE derivative. I have some empirical results which show that most of the internal mathematical operators of back prop can be replaced by qualitatively similar operators. I'm not talking about reducing bit width, as most of the literature does. I was interested in what happens when you replace multiplication with maximum, the sigmoid with a generic bump, etc. What was suprising was that all the tweeks basically worked. Back prop is "functionally" stable in the sense that the learning functional ability remains regardless of minor shifts in internal organization. The reason that the reduced accuracy results are the way that they are can be traced to the loss of continuity rather than the loss of bits. John Kolen  From gary at cs.UCSD.EDU Sun Feb 7 13:09:19 1993 From: gary at cs.UCSD.EDU (Gary Cottrell) Date: Sun, 7 Feb 93 10:09:19 -0800 Subject: Does backprop need the derivative ?? Message-ID: <9302071809.AA00283@odin.ucsd.edu> The sign is always positive. Hence not using it is an approximation that preserves the sign. -g.  From Scott_Fahlman at SEF-PMAX.SLISP.CS.CMU.EDU Sun Feb 7 13:02:42 1993 From: Scott_Fahlman at SEF-PMAX.SLISP.CS.CMU.EDU (Scott_Fahlman@SEF-PMAX.SLISP.CS.CMU.EDU) Date: Sun, 07 Feb 93 13:02:42 EST Subject: Does backprop need the derivative ?? In-Reply-To: Your message of Sat, 06 Feb 93 08:20:57 -0800. <9302061620.AA29550@odin.ucsd.edu> Message-ID: I happen to know it doesn't work for a more complicated encoder problem: Image compression. When Paul Munro & I were first doing image compression back in 86, the error would go down and then back up! Rumelhart said: "there's a bug in your code" and indeed there was: we left out the derivative on the hidden units. -g. I can see why not using the true derivative of the sigmoid, but just an approximation that preserves the sign, might cause learning to bog down, but I don't offhand see how it could cause the error to go up, at least in a net with only one hidden layer and with a monotonic activation function. I wonder if this problem would also occur in a net using the "sigmoid prime offset", which adds a small constant to the derivative of the sigmoid. I haven't seen it. -- Scott =========================================================================== Scott E. Fahlman Internet: sef+ at cs.cmu.edu Senior Research Scientist Phone: 412 268-2575 School of Computer Science Fax: 412 681-5739 Carnegie Mellon University Latitude: 40:26:33 N 5000 Forbes Avenue Longitude: 79:56:48 W Pittsburgh, PA 15213 ===========================================================================  From marwan at sedal.su.oz.au Sun Feb 7 18:13:36 1993 From: marwan at sedal.su.oz.au (Marwan Jabri) Date: Mon, 8 Feb 1993 10:13:36 +1100 Subject: Does backprop need the derivative ?? 
Message-ID: <9302072313.AA24874@sedal.sedal.su.OZ.AU> > It is true that several studies show a sudden failure of backprop learning > when you use fixnum arithmetic and reduce the number of bits per word. The > point of failure seems to be problem-specific, but is often around 10-14 > bits (incuding sign). > > Marcus Hoehfeld and I studied this issue and found that the source of the > failure was a quantization effect: the learning algorithm needs to > accumulate lots of small steps, for weight-update or whatever, and since > these are smaller than half the low-order bit, it ends up accumulating a > lot of zeros instead. We showed that if a form of probabilisitic rounding > (dithering) is used to smooth over these quantization steps, learning > continues on down to 4 bits or fewer, with only a gradual degradation in > learning time, number of units/weights required, and quality of the result. > This study used Cascor, but we believe that the results hold for backprop > as well. > > Marcus Hoehfeld and Scott E. Fahlman (1992) "Learning with Limited > Numerical Precision Using the Cascade-Correlation Learning Algorithm" > in IEEE Transactions on Neural Networks, Vol. 3, no. 4, July 1992, pp. > 602-611. > Yun Xie and I have tried similar experiments on the Sonar and ECG data, and it is fair to say that standard backprop gives up at about 10 bits [2]. In a closer look at the quantisation effects you would find that the signal/noise ratio depends on the number of layers [1]. As you go deeper you require less precision. This would be a source of variation between backprop and cascor. > Of course, a learning system implemented in analog hardware might have only > a few bits of accuracy due to noise and nonlinearity in the circuits, but > it wouldn't suffer from this quantization effect, since you get a sort of > probabilistic dithering for free. > Hmmm... precision also suffers from the number of operations in analog implementations. The free dithering you get is everywhere, including in your errors! The gradient descent turns into a yoyo. This is well explained in [2, 3]. The best way of using backprop, or more efficiently conjugate gradient, is to do the training off-chip and then to download the (truncated) weights. Our experience in the training of real analog chips shows that some further in-loop training is required. Note our chips were ultra low power and you may have fewer problems with strong inversion implementations. Regarding the idea of Simplex that has been suggested: the inquirer was talking about on-chip learning. Have you in your experiments done a limited precision Simplex? Have you tried it on a chip in in-loop mode? Philip Leong here has tried a similar idea (I think) a while back. The problem with this approach is that you need to have a very good guess at your starting point as the Simplex will move you from one vertex (feasible solution) to another while expanding the weight solution space. Philip's experience is that it does work for small problems when you have a good guess! At the last NIPS, there were 4 posters about learning in or for analog chips. The inquirer may wish to consult these papers (two at least were advertised as deposited in the neuroprose archive, one by Gert Cauwenberghs and one by Barry Flower and me). So far, for us, the most reliable analog chip training algorithm has been the combined search algorithm (modified weight perturbation and partial random search) [3]. I will be very interested in hearing more about experiments where analog chips are trained.
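For readers who want to see what the "probabilistic rounding" (dithering) quoted above amounts to, here is a small generic sketch. It is not the Hoehfeld & Fahlman code; the quantization step, the place where rounding is applied and the toy numbers are assumptions made only for illustration.

# Generic sketch of stochastic ("probabilistic") rounding of weight updates to
# a fixed-point grid, the idea quoted above for limited-precision learning.
# The grid step and toy update values are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def stochastic_round(delta_w, step):
    """Round each update to a multiple of `step`, rounding up with probability
    equal to the fractional remainder, so the expected rounded update equals
    the true update."""
    scaled = delta_w / step
    lower = np.floor(scaled)
    frac = scaled - lower
    return (lower + (rng.random(delta_w.shape) < frac)) * step

step = 0.01                            # pretend the weight memory resolves 0.01
updates = np.full(1000, 0.001)         # a thousand tiny updates of +0.001 each
w_det = np.sum(np.round(updates / step) * step)   # deterministic rounding: 0.0
w_sto = np.sum(stochastic_round(updates, step))   # roughly +1.0 on average
print(w_det, w_sto)

Deterministic rounding silently discards every update smaller than half a grid step, which is the accumulation-of-zeros failure described in the quoted passage; the stochastic version preserves the updates in expectation at the cost of extra noise.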
Marwan [1] Yun Xie and M. Jabri, Analysis of the Effects of Quantization in Multi-layer Neural Networks Using A Statistical Model, IEEE Transactions on Neural Networks, Vol. 3, No. 2, pp. 334-338, March, 1992. [2] M. Jabri, S. Pickard, P. Leong and Y. Xie, Algorithms and Implementation Issues in Analog Low Power Learning Neural Network Chips, To appear in the International Journal on VLSI Signal Processing, early 1993, USA. [3] Y. Xie and M. Jabri, On the Training of Limited Precision Multi-layer Perceptrons. Proceedings of the International Joint Conference on Neural Networks, pp III-942-947, July 1992, Baltimore, USA. ------------------------------------------------------------------- Marwan Jabri Email: marwan at sedal.su.oz.au Senior Lecturer Tel: (+61-2) 692-2240 SEDAL, Electrical Engineering, Fax: 660-1228 Sydney University, NSW 2006, Australia Mobile: (+61-18) 259-086  From takagi at diva.berkeley.edu Sun Feb 7 14:36:59 1993 From: takagi at diva.berkeley.edu (Hideyuki Takagi) Date: Sun, 7 Feb 93 11:36:59 -0800 Subject: attendance restriction at BISC Special Seminar Message-ID: <9302071936.AA00803@diva.Berkeley.EDU> ORGANIZATIONAL CHANGE in Extended BISC Special Seminar 10:30AM-5:45PM, March 28 (Sunday), 1993 Sibley Auditorium (210) in Bechtel Hall University of California, Berkeley CA 94720 Dear Colleagues: This is to inform you of an organizational change in the Extended BISC Special Seminar which was announced on February 4. Most of the speakers in the regular BISC Seminar are associated with companies and universities in the Bay area. The motivation for the Extended BISC Seminar was to take advantage of the presence in the Bay area of some of the leading contributors to fuzzy logic and neural network theory from abroad, who will be participating in FUZZ-IEEE'93 / ICNN'93. A problem which became apparent is that because both the Extended BISC Seminar and the FUZZ-IEEE'93/ICNN'93 tutorials are scheduled to take place on the same day, the BISC Seminar may have an adverse effect on registration for the conference tutorials. To resolve this problem, it was felt that it may be necessary to restrict attendance at the Extended BISC Seminar to students and faculty in the Bay area who normally attend the BISC Seminar. In this way, the Extended BISC Seminar would serve its usual role and at the same time bring to the Berkeley Campus some of the leading contributors to soft computing. The publicity for the Extended BISC Seminar will state that attendance is limited to students and faculty in the Bay area. Sincerely, BISC (Berkeley Initiative for Soft Computing) ---------------------------------------------  From mav at cs.uq.oz.au Sun Feb 7 19:33:21 1993 From: mav at cs.uq.oz.au (Simon Dennis) Date: Mon, 08 Feb 93 10:33:21 +1000 Subject: Learning in Memory Technical Report Message-ID: <9302080033.AA10081@uqcspe.cs.uq.oz.au> The following technical report is available for anonymous ftp. TITLE: Integrating Learning into Models of Human Memory: The Hebbian Recurrent Network AUTHORS: Simon Dennis and Janet Wiles ABSTRACT: We develop an interactive model of human memory called the Hebbian Recurrent Network (HRN) which integrates work in the mathematical modeling of memory with that in error correcting connectionist networks. It incorporates the matrix model (Pike, 1984; Humphreys, Bain & Pike, 1989) into the Simple Recurrent Network (SRN, Elman, 1989).
The result is an architecture which has the desirable memory characteristics of the matrix model such as low interference and massive generalization but which is able to learn appropriate encodings for items, decision criteria and the control functions of memory which have traditionally been chosen a priori in the mathematical memory literature. Simulations demonstrate that the HRN is well suited to a recognition task inspired by typical memory paradigms. When compared against the SRN the HRN is able to learn longer lists, generalizes from smaller training sets, and is not degraded significantly by increasing the vocabulary size. Please mail correspondence to mav at cs.uq.oz.au Ftp Instructions: $ ftp exstream.cs.uq.oz.au Connected to exstream.cs.uq.oz.au. 220 exstream FTP server (Version 6.12 Fri May 8 16:33:17 EST 1992) ready. Name (exstream.cs.uq.oz.au:mav): anonymous 331 Guest login ok, send e-mail address as password. Password: 230- Welcome to ftp.cs.uq.oz.au 230-This is the University of Queensland Computer Science Anonymous FTP server. 230-For people outside of the department, please restrict your usage to outside 230-of the hours 8am to 6pm. 230- 230-The local time is Mon Feb 8 10:26:05 1993 230- 230 Guest login ok, access restrictions apply. ftp> cd pub/TECHREPORTS/department 250 CWD command successful. ftp> bin 200 Type set to I. ftp> get TR0252.ps.Z 200 PORT command successful. 150 Opening BINARY mode data connection for TR0252.ps.Z (160706 bytes). 226 Transfer complete. local: TR0252.ps.Z remote: TR0252.ps.Z 160706 bytes received in 0.71 seconds (2.2e+02 Kbytes/s) ftp> quit 221 Goodbye. $ Printing Instructions: $ zcat TR0252.ps.Z | lpr  From efiesler at idiap.ch Mon Feb 8 03:22:31 1993 From: efiesler at idiap.ch (E. Fiesler) Date: Mon, 8 Feb 93 09:22:31 +0100 Subject: Does backprop need the derivative ?? Message-ID: <9302080822.AA22484@idiap.ch> Marwan Jabri wrote: > Date: Sat, 6 Feb 1993 23:49:53 +1100 > From: Marwan Jabri > Subject: Re: Does backprop need the derivative ?? > > As the intention of the inquirer is the analog implementation of > backprop, I see two problems: 1- the question whether the derivative can > be replaced by a constant, and more importantly 2- whether the precision > of the analog implementation will be high enough for backprop to work. > > Regarding (1), ... > > Regarding (2), there has been several reports indicating that > backpropagation simply does not work when the number of bits is reduced > towards 6-8 bits! This is often reported for standard backpropagation. However, a simple extension of backpropagation can make it work for any precision; up to 1-2 bits. I'll append the reference(s) below. E. Fiesler Directeur de Recherche IDIAP Case postale 609 CH-1920 Martigny Switzerland @InProceedings{Fiesler-90, Author = "E. Fiesler and A. Choudry and H. J. Caulfield", Title = "A Weight Discretization Paradigm for Optical Neural Networks", BookTitle = "Proceedings of the International Congress on Optical Science and Engineering", Volume = "SPIE-1281", Pages = "164--173", Publisher = "The International Society for Optical Engineering Proceedings", Address = "Bellingham, Washington, U.S.A.", Year = "1990", ISBN = "0-8194-0328-8", Language = "English" } @Article{Fiesler-93, Author = "E. Fiesler and A. Choudry and H. J. 
Caulfield", Title = "A Universal Weight Discretization Method for Multi-Layer Neural Networks", Journal = "IEEE Transactions on Systems, Man, and Cybernetics (IEEE-SMC)", Publisher = "The Institute of Electrical and Electronics Engineers (IEEE), Inc.", Address = "New York, New York", Year = "1993", ISSN = "0018-9472", Language = "English", Note = "Accepted for publication." }  From annette at cdu.ucl.ac.uk Mon Feb 8 05:13:06 1993 From: annette at cdu.ucl.ac.uk (Annette Karmiloff-Smith) Date: Mon, 8 Feb 93 10:13:06 GMT Subject: Cognitive Development for Connectionists Message-ID: <9302081013.AA14475@cdu.ucl.ac.uk> Below are details of two articles and a book which may be of interest to connectionists: A.Karmiloff-Smith (1992), Connection Science, Vo.4, Nos. 3 & 4, 253- 269. NATURE, NURTURE ANDS PDP: Preposterous Developmental Postulates? (N.B. the question mark - I end on: Promising Developmental Postulates!) Abstract: In this article I discuss the nature/nurture debate in terms of evidence and theorizing from the field of cognitive development, and pinpoint various problems where the Connectionist framework needs to be further explored from this perspective. Evidence from normal and abnormal developmental phenotypes points to some domain-specific constraints on early learning. Yet, by invoking the dynamics of epigenesis, I avoid recourse to a strong Nativist stance and remain within the general spirit of Connectionism. _____________________________________________________________ A. Karmiloff-Smith (1992) Technical Report TR.PDP.CNS.92.7, Carnegie Mellon University, Pittsburgh. ABNORMAL PHENOTYPES AND THE CHALLENGES THEY POSE TO CONNECTIONIST MODELS OF DEVELOPMENT Abstract: The comparison of different abnormal phenotypes (e.g. Williams syndrome, Down syndrome, autism, hydrocephalus with associated myelomeningocele) raises a number of questions about domain-general versus domain-specific processes and suggests that development stems from domain-specific predispositions which channel infantsU attention to proprietary inputs. This is not to be confused with a strong Nativist position. Genetically fully specified modules are not the starting point of development. Rather, a process of gradual modularization builds on skeletal domain-specific predispositions (architectural and/or representational) which give the normal infant a small but significant head-start. It is argued that Down syndrome infants may lack these head-starts, whereas individuals with Williams syndrome, autism and hydrocephalus with associated myelomeningocele have a head-start in selected domains only, leading to different cognitive profiles despite equivalent input. Stress is placed on the importance of exploring a developing system, rather than a lesioned adult system. The position developed in the paper not only contrasts with the strong Nativist stance, but also with the view that domain-general processes are simply applied to whatever inputs the child encounters. The comparison of different phenotypical outcomes is shown to pose interesting challenges to connectionist simulations of development. ______________________________________________________________ A.Karmiloff-Smith (1992) BEYOND MODULARITY: A DEVELOPMENTAL PERSPECTIVE ON COGNITIVE SCIENCE. MIT Press/Bradford Books. A book intended to excite connectionists and other non- developmentalists about the essential role that a developmental perspective has in understanding the special nature of human cognition compared to other species. Contents: 1. 
Taking development seriously 2. The child as a linguist 3. The child as a physicist 4. The child as a mathematician 5. The child as a psychologist 6. The child as a notator 7. Nativism, domain specificity and PiagetUs constructivism 8. Modelling development: representational redescription and connectionism 9. Concluding speculations Reprints of articles obtainable from: Annette Karmiloff-Smith Medical Research Council Cognitive Development Unit London WC1H 0AH. U.K.  From SCHOLTES at ALF.LET.UVA.NL Mon Feb 8 06:19:00 1993 From: SCHOLTES at ALF.LET.UVA.NL (SCHOLTES@ALF.LET.UVA.NL) Date: 08 Feb 1993 12:19 +0100 (MET) Subject: PhD Dissertation Available Message-ID: <346B17ED606070C5@VAX1.SARA.NL> =================================================================== Ph.D. DISSERTATION AVAILABLE on Neural Networks, Natural Language Processing, Information Retrieval 292 pages and over 350 references =================================================================== A Copy of the dissertation "Neural Networks in Natural Language Processing and Information Retrieval" by Johannes C. Scholtes can be obtained for cost price and fast airmail- delivery at US$ 25,-. Payment by Major Creditcards (VISA, AMEX, MC, Diners) is accepted and encouraged. Please include Name on Card, Number and Exp. Date. Your Credit card will be charged for Dfl. 47,50. Within Europe one can also send a Euro-Cheque for Dfl. 47,50 to: University of Amsterdam J.C. Scholtes Dufaystraat 1 1075 GR Amsterdam The Netherlands Do not forget to mention a surface shipping address. Please allow 2-4 weeks for delivery. Abstract 1.0 Machine Intelligence For over fifty years the two main directions in machine intelligence (MI), neural networks (NN) and artificial intelligence (AI), have been studied by various persons with many different backgrounds. NN and AI seemed to conflict with many of the traditional sciences as well as with each other. The lack of a long research history and well defined foundations has always been an obstacle for the general acceptance of machine intelligence by other fields. At the same time, traditional schools of science such as mathematics and physics developed their own tradition of new or "intelligent" algorithms. Progress made in the field of statistical reestimation techniques such as the Hidden Markov Models (HMM) started a new phase in speech recognition. Another application of the progress of mathematics can be found in the application of the Kalman filter in the interpretation of sonar and radar signals. Much more examples of such "intelligent" algorithms can be found in the statistical classification en filtering techniques of the study of pattern recognition (PR). Here, the field of neural networks is studied with that of pattern recognition in mind. Although only global qualitative comparisons are made, the importance of the relation between them is not to be underestimated. In addition it is argued that neural networks do indeed add something to the fields of MI and PR, instead of competing or conflicting with them. 2.0 Natural Language Processing The study of natural language processing (NLP) exists even longer than that of MI. Already in the beginning of this century people tried to analyse human language with machines. However, serious efforts had to wait until the development of the digital computer in the 1940s, and even then, the possibilities were limited. For over 40 years, symbolic AI has been the most important approach in the study of NLP. 
That this has not always been the case, may be concluded from the early work on NLP by Harris. As a matter of fact, Chomsky's Syntactic Structures was an attack on the lack of structural properties in the mathematical methods used in those days. But, as the latter's work remained the standard in NLP, the former has been forgotten completely until recently. As the scientific community in NLP devoted all its attention to the symbolic AI-like theories, the only useful practical implementation of NLP systems were those that were based on statistics rather than on linguistics. As a result, more and more scientists are redirecting their attention towards the statistical techniques available in NLP. The field of connectionist NLP can be considered as a special case of these mathematical methods in NLP. More than one reason can be given to explain this turn in approach. On the one hand, many problems in NLP have never been addressed properly by symbolic AI. Some examples are robust behavior in noisy environments, disambiguation driven by different kinds of knowledge, commensense generalizations, and learning (or training) abilities. On the other hand, mathematical methods have become much stronger and more sensitive to specific properties of language such as hierarchical structures. Last but not least, the relatively high degree of success of mathematical techniques in commercial NLP systems might have set the trend towards the implementation of simple, but straightforward algorithms. In this study, the implementation of hierarchical structures and semantical features in mathematical objects such as vectors and matrices is given much attention. These vectors can then be used in models such as neural networks, but also in sequential statistical procedures implementing similar characteristics. 3.0 Information Retrieval The study of information retrieval (IR) was traditionally related to libraries on the one hand and military applications on the other. However, as PC's grew more popular, most common users loose track of the data they produced over the last couple of years. This, together with the introduction of various "small platform" computer programs made the field of IR relevant to ordinary users. However, most of these systems still use techniques that have been developed over thirty years ago and that implement nothing more than a global surface analysis of the textual (layout) properties. No deep structure whatsoever, is incorporated in the decision whether or not to retrieve a text. There is one large dilemma in IR research. On the one hand, the data collections are so incredibly large, that any method other than a global surface analysis would fail. On the other hand, such a global analysis could never implement a contextually sensitive method to restrict the number of possible candidates returned by the retrieval system. As a result, all methods that use some linguistic knowledge exist only in laboratories and not in the real world. Conversely, all methods that are used in the real world are based on technological achievements from twenty to thirty years ago. Therefore, the field of information retrieval would be greatly indebted to a method that could incorporate more context without slowing down. As computers are only capable of processing numbers within reasonable time limits, such a method should be based on vectors of numbers rather than on symbol manipulations. This is exactly where the challenge is: on the one hand keep up the speed, and on the other hand incorporate more context. 
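As a deliberately simple illustration of what a retrieval method "based on vectors of numbers" looks like, the sketch below turns documents and a query into term-count vectors and ranks them by cosine similarity. This is a generic textbook-style example, not material from the dissertation itself (which studies Kohonen feature maps); the toy documents, the bag-of-words representation and the cosine measure are assumptions made only for illustration.

# Generic vector-space retrieval sketch: documents and query become term-count
# vectors and are ranked by cosine similarity. Purely illustrative; the
# dissertation itself studies Kohonen-feature-map representations instead.
import numpy as np
from collections import Counter

docs = [
    "neural networks for natural language processing",
    "information retrieval with vector representations",
    "self organization in kohonen feature maps",
]
query = "retrieval of information with vectors"

vocab = sorted({w for text in docs + [query] for w in text.split()})

def to_vector(text):
    counts = Counter(text.split())
    return np.array([counts[w] for w in vocab], dtype=float)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

q = to_vector(query)
scores = [(cosine(to_vector(d), q), d) for d in docs]
for score, d in sorted(scores, reverse=True):
    print(f"{score:.3f}  {d}")

Everything in this sketch is a global surface analysis in the sense used above; the question raised here is how far such numerical representations can be pushed to carry more context without giving up that kind of speed.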
If possible, the data representation of the contextual information must not be restricted to a single type of media. It should be possible to incorporate symbolic language as well as sound, pictures and video concurrently in the retrieval phase, although one does not know exactly how yet... Here, the emphasis is more on real-time filtering of large amounts of dynamic data than on document retrieval from large (static) data bases. By incorporating more contextual information, it should be possible to implement a model that can process large amounts of unstructured text without providing the end-user with an overkill of information. 4.0 The Combination As this study is a very multi-disciplinary one, the risk exists that it remains restricted to a surface discussion of many different problems without analyzing one in depth. To avoid this, some central themes, applications and tools are chosen. The themes in this work are self-organization, distributed data representations and context. The applications are NLP and IR, the tools are (variants of) Kohonen feature maps, a well known model from neural network research. Self-organization and context are more related to each other than one may suspect. First, without the proper natural context, self-organization shall not be possible. Next, self-organization enables one to discover contextual relations that were not known before. Distributed data representation may solve many of the unsolved problems in NLP and IR by introducing a powerful and efficient knowledge integration and generalization tool. However, distributed data representation and self-organization trigger new problems that should be solved in an elegant manner. Both NLP and IR work on symbolic language. Both have properties in common but both focus on different features of language. In NLP hierarchical structures and semantical features are important. In IR the amount of data sets the limitations of the methods used. However, as computers grow more powerful and the data sets get larger and larger, both approaches get more and more common ground. By using the same models on both applications, a better understanding of both may be obtained. Both neural networks and statistics would be able to implement self-organization, distrib- uted data and context in the same manner. In this thesis, the emphasis is on Kohonen feature maps rather than on statistics. However, it may be possible to implement many of the techniques used with regular sequential mathematical algorithms. So, the true aim of this work can be formulated as the understanding of self-organization, distributed data representation, and context in NLP and IR, by in depth analysis of Kohonen feature maps. ==============================================================================  From george at psychmips.york.ac.uk Mon Feb 8 08:32:38 1993 From: george at psychmips.york.ac.uk (George Bolt) Date: Mon, 8 Feb 93 13:32:38 +0000 (GMT) Subject: Does backprop need the derivative ?? Message-ID: Heini Withagen wrote: In his paper, 'An Empirical Study of Learning Speed in Back-Propagation Networks', Scott E. Fahlmann shows that with the encoder/decoder problem it is possible to replace the derivative of the transfer function by a constant. I have been able to reproduce this example. However, for several other examples, it was not possible to get the network converged using a constant for the derivative. - end quote - I've looked at BP learning in MLP's w.r.t. 
fault tolerance and found that the derivative of the transfer function is used to *stop* learning. Once a unit's weights for some particular input (to that unit rather than the network) are sufficiently developed for it to decide whether to output 0 or 1, then weight changes are approximately zero due to this derivative. I would imagine that by setting it to a constant, an MLP will over-learn certain patterns and be unable to converge to a state of equilibrium, i.e. all patterns are matched to some degree. A better route would be to set the derivative function to a constant over a range [-r,+r], where f(|r|) -> 1.0. To make individual units robust with respect to weights, make r = c.a where f(|a|) -> 1.0 and c is a small constant multiplicative value. - George Bolt University of York, U.K.  From movellan at cogsci.UCSD.EDU Mon Feb 8 20:33:19 1993 From: movellan at cogsci.UCSD.EDU (Javier Movellan) Date: Mon, 8 Feb 93 17:33:19 PST Subject: Does backprop need the derivative ?? In-Reply-To: Marwan Jabri's message of Sat, 6 Feb 1993 23:49:53 +1100 <9302061249.AA17234@sedal.sedal.su.OZ.AU> Message-ID: <9302090133.AA16068@cogsci.UCSD.EDU> My experience with Boltzmann machines and GRAIN/diffusion networks (the continuous stochastic version of the Boltzmann machine) has been that replacing the real gradient by its sign times a constant accelerates learning DRAMATICALLY. I first saw this technique in one of the original CMU tech reports on the Boltzmann machine. I believe Peterson and Hartman and Peterson and Anderson also used this technique, which they called "Manhattan updating", with the deterministic Mean Field learning algorithm. I believe they had an article in "Complex Systems" comparing Backprop and Mean-Field, both with standard gradient descent and with Manhattan updating. It is my understanding that the Mean-Field/Boltzmann chip developed at Bellcore uses "Manhattan Updating" as its default training method. Josh Alspector is the person to contact about this. At this point I've tried 4 different learning algorithms with continuous and discrete stochastic networks and in all cases Manhattan Updating worked better than straight gradient descent. The question is why Manhattan updating works so well (at least in stochastic and Mean-Field networks)? One possible interpretation is that Manhattan updating limits the influence of outliers and thus it performs something similar to robust regression. Another interpretation is that Manhattan updating avoids the saturation regions, where the error space becomes almost flat in some dimensions, slowing down learning. One of the disadvantages of Manhattan updating is that sometimes one needs to reduce the weight change constant at the end of learning. But sometimes we also do this in standard gradient descent anyway. -Javier  From oby%firenze%venezia.ROCKEFELLER.EDU at ROCKVAX.ROCKEFELLER.EDU Mon Feb 8 20:42:08 1993 From: oby%firenze%venezia.ROCKEFELLER.EDU at ROCKVAX.ROCKEFELLER.EDU (Klaus Obermayer) Date: Mon, 8 Feb 93 20:42:08 -0500 Subject: No subject Message-ID: <9302090142.AA01612@firenze> The following article is available as a (hardcopy) preprint: Obermayer K. and Blasdel G.G. (1993), Geometry of Orientation and Ocular Dominance Columns in Monkey Striate Cortex, J. Neurosci., in press.
Abstract: In addition to showing that ocular dominance is organized in slabs and that orientation preferences are organized in linear sequences likely to reflect slabs, Hubel and Wiesel (1974) discussed the intriguing possibility that slabs of orientation might intersect slabs of ocular dominance at some consistent angle. Advances in optical imaging now make it possible to test this possibility directly. When maps of orientation are analyzed quantitatively, they appear to arise from a combination of at least two competing themes: one where orientation preferences change linearly along straight axes, remaining constant along perpendicular axes and forming iso-orientation slabs along the way, and one where orientation preferences change continuously along circular axes, remaining constant along radial axes and forming singularities at the centers of the spaces enclosed. When orientation patterns are compared with ocular dominance patterns from the same cortical regions, quantitative measures reveal: 1) that singularities tend to lie at the centers of ocular dominance columns, 2) that linear zones (arising where orientation preferences change along straight axes) tend to lie at the edges of ocular dominance columns, and 3) that the short iso-orientation bands within each linear zone tend to intersect the borders of ocular dominance slabs at angles of approximately 90$^o$. ----------------------------------------------------------------- The original article contains color figures which - for cost reasons - have to be reproduced black and white. If you would like to obtain a copy, please send your surface mail address to: Klaus Obermayer The Rockefeller University oby at rockvax.rockefeller.edu -----------------------------------------------------------------  From thgoh at iss.nus.sg Tue Feb 9 01:05:52 1993 From: thgoh at iss.nus.sg (Goh Tiong Hwee) Date: Tue, 9 Feb 1993 14:05:52 +0800 (WST) Subject: Does Backprop need the derivative Message-ID: <9302090605.AA08961@iss.nus.sg> From fellous%hyla.usc.edu at usc.edu Wed Feb 10 21:48:50 1993 From: fellous%hyla.usc.edu at usc.edu (Jean-Marc Fellous) Date: Wed, 10 Feb 93 18:48:50 PST Subject: CNE / USC Workshop Reminder and Update. Message-ID: <9302110248.AA01295@hyla.usc.edu> Thank you for posting the following final announcement: *********************** Last Reminder and Update ************************ SCHEMAS AND NEURAL NETWORKS INTEGRATING SYMBOLIC AND SUBSYMBOLIC APPROACHES TO COOPERATIVE COMPUTATION A Workshop sponsored by the Center for Neural Engineering University of Southern California Los Angeles, CA 90089-2520 April 13th and 14th, 1993 Program Committee: Michael Arbib (Organizer), John Barnden, George Bekey, Francisco Cervantes-Perez, Damian Lyons, Paul Rosenbloom, Ron Sun, Akinori Yonezawa A previous announcement (reproduced below) announced a registra- tion fee of $150 and advertised the availability of hotel accom- modation at $70/night. To encourage the participation of qualified students we have made 3 changes: 1) We have appointed Jean-Marc Fellous as Student Chair for the meeting to coordinate the active involvement of such students. 2) We offer a Student Registration Fee of only $40 to students whose application is accompanied by a letter from their supervi- sor attesting to their student status. 3) Mr. 
Fellous has identified a number of lower-cost housing op- tions, and will respond to queries to fellous at rana.usc.edu The original announcement - with updated registration form - fol- lows: To design complex technological systems and to analyze complex biological and cognitive systems, we need a multilevel methodolo- gy which combines a coarse-grain analysis of cooperative or dis- tributed computation (we shall refer to the computing agents at this level as "schemas") with a fine-grain model of flexible, adaptive computation (for which neural networks provide a power- ful general paradigm). Schemas provide a language for distri- buted artificial intelligence, perceptual robotics, cognitive modeling, and brain theory which is "in the style of the brain", but at a relatively high level of abstraction relative to neural networks. The proposed workshop will provide a 2-hour introductory tutorial and problem statement by Michael Arbib, and sessions in which an invited paper will be followed by several contributed papers, selected from those submitted in response to this call for pa- pers. Preference will be given to papers which present practical examples of, theory of, and/or methodology for the design and analysis of complex systems in which the overall specification or analysis is conducted in terms of schemas, and where some but not necessarily all of the schemas are implemented in neural net- works. A list of sample topics for contributions is as follows, where a hybrid approach means one in which the abstract schema level is integrated with neural or other lower level models: Schema Theory as a description language for neural networks Modular neural networks Linking DAI to Neural Networks to Hybrid Architecture Formal Theories of Schemas Hybrid approaches to integrating planning & reaction Hybrid approaches to learning Hybrid approaches to commonsense reasoning by integrating neural networks and rule- based reasoning (using schema for the integration) Programming Languages for Schemas and Neural Networks Concurrent Object-Oriented Programming for Distributed AI and Neural Networks Schema Theory Applied in Cognitive Psychology, Linguistics, Robotics, AI and Neuroscience Prospective contributors should send a hard copy of a five-page extended abstract, including figures with informative captions and full references (either by regular mail or fax) by February 15, 1993 to: Michael Arbib, Center for Neural Engineering University of Southern California Los Angeles, CA 90089-2520 USA Tel: (213) 740-9220 Fax: (213) 746-2863 arbib at pollux.usc.edu] Please include your full address, including fax and email, on the paper. Notification of acceptance or rejection will be sent by email no later than March 1, 1993. There are currently no plans to issue a formal proceedings of full papers, but revised versions of ac- cepted abstracts received prior to April 1, 1993 will be collect- ed with the full text of the Tutorial in a CNE Technical Report which will be made available to registrants at the start of the meeting. [A useful way to structure such an abstract is in short numbered sections, where each section presents (in a small type face!) the material corresponding to one transparency/slide in a verbal presentation. This will make it easy for an audi- ence to take notes if they have a copy of the abstract at your presentation.] 
Hotel Information: Attendees may register at the hotel of their choice, but the closest hotel to USC is the University Hilton, 3540 South Figueroa Street, Los Angeles, CA 90007, Phone: (213) 748- 4141, Reservation: (800) 872-1104, Fax: (213) 748- 0043. A single room costs $70/night while a double room costs $75/night. Workshop participants must specify that they are "Schemas and Neural Networks Workshop" attendees to avail of the above rates. Information on student accommodation may be ob- tained from the Student Chair, Jean-Marc Fellous, fellous at rana.usc.edu. The registration fee of $150 ($40 for qualified students who in- clude a "certificate of student status" from their advisor) in- cludes a copy of the abstracts, coffee breaks, and a dinner to be held on the evening of April 13th. Those wishing to register should send a check payable to "Center for Neural Engineering, USC" for $150 ($40 for students) together with the following information to: Paulina Tagle Center for Neural Engineering University of Southern California University Park Los Angeles, CA 90089-2520 USA ---------------------------------------------------------- SCHEMAS AND NEURAL NETWORKS Center for Neural Engineering USC April 13 - 14, 1993 NAME: ___________________________________________ ADDRESS: _________________________________________ PHONE NO.: _______________ FAX:___________________ EMAIL: ___________________________________________ I intend to submit a paper: YES [ ] NO [ ]  From ljubomir at darwin.bu.edu Wed Feb 10 21:12:30 1993 From: ljubomir at darwin.bu.edu (Ljubomir Buturovic) Date: Wed, 10 Feb 93 21:12:30 -0500 Subject: Does backprop need the derivative? Message-ID: <9302110212.AA07255@darwin.bu.edu> Marwan Jabri: > Regarding the idea of Simplex that has been suggested. The inquirer was > talking about on-chip learning. Have you in your experiments done a > limited precision Simplex? Have you tried it on a chip in in-loop mode? > Philip Leong here has tried a similar idea (I think) a while back. The > problem with this approach is that you need to a have a very good guess at > your starting point as the Simplex will move you from one vertex (feasible > solution) to another while expanding the weight solution space. > Philip's experience is that it does work for small problems when you have > a good guess! No, we did not try limited precision Simplex, since the method has another serious limitation, which is memory complexity. So there is no point performing such refined studies until this problem is resolved, let alone on-chip implementation. The biggest problem we tried it on succesfully was 11-dimensional (i. e., input samples were 11-dimensional vectors). The initial guess was pseudo-random, like in back-propagation. In another, 12-dimensional example, it did not do well (neither did back-prop, but Simplex was much worse), so it might be true that it needs a good starting point. Ljubomir Buturovic Boston University BioMolecular Engineering Research Center 36 Cummington Street, 3rd Floor Boston, MA 02215 office: 617-353-7123 home: 617-738-6487  From mozer at dendrite.cs.colorado.edu Thu Feb 11 23:47:27 1993 From: mozer at dendrite.cs.colorado.edu (Michael C. 
Mozer) Date: Thu, 11 Feb 1993 21:47:27 -0700 Subject: Preprint: Neural net architectures for temporal sequence processing Message-ID: <199302120447.AA06812@neuron.cs.colorado.edu>
-.--.---.----.-----.------.-------.--------.-------.------.-----.----.---.--.- PLEASE DO NOT POST TO OTHER BOARDS -.--.---.----.-----.------.-------.--------.-------.------.-----.----.---.--.-
Neural net architectures for temporal sequence processing
Michael C. Mozer Department of Computer Science University of Colorado
I present a general taxonomy of neural net architectures for processing time-varying patterns. This taxonomy subsumes many existing architectures in the literature, and points to several promising architectures that have yet to be examined. Any architecture that processes time-varying patterns requires two conceptually distinct components: a short-term memory that holds on to relevant past events and an associator that uses the short-term memory to classify or predict. The taxonomy is based on a characterization of short-term memory models along the dimensions of form, content, and adaptability. Experiments on predicting future values of a financial time series (US dollar-Swiss franc exchange rates) are presented using several alternative memory models. The results of these experiments serve as a baseline against which more sophisticated architectures can be compared.
To appear in: A. S. Weigend & N. A. Gershenfeld (Eds.), _Predicting the future and understanding the past_. Redwood City, CA: Addison-Wesley. Spring 1993.
-.--.---.----.-----.------.-------.--------.-------.------.-----.----.---.--.-
To retrieve: unix> ftp archive.cis.ohio-state.edu Name: anonymous 230 Guest login ok, access restrictions apply. ftp> cd pub/neuroprose ftp> binary ftp> get mozer.architectures.ps.Z 200 PORT command successful. ftp> quit unix> zcat mozer.architectures.ps.Z | lpr Warning: May not print on wimpy laser printers.
From kolen-j at cis.ohio-state.edu Tue Feb 9 07:51:53 1993 From: kolen-j at cis.ohio-state.edu (john kolen) Date: Tue, 9 Feb 93 07:51:53 -0500 Subject: Does backprop need the derivative? In-Reply-To: Radford Neal's message of Sun, 7 Feb 1993 12:24:15 -0500 <93Feb7.122429edt.227@neuron.ai.toronto.edu> Message-ID: <9302091251.AA27813@pons.cis.ohio-state.edu>
The sign of the derivative is always positive (remember o(1-o) and 0 < o < 1).
>Other posters have discussed, regarding backprop... > >> ... the question whether the derivative can be replaced by a constant, > >To clarify, I believe the intent is that the "constant" have the same >sign as the derivative, but have constant magnitude.
I haven't been following this thread, but the following reference may be helpful to those that are. Blum (Annals of Math. Statistics, vol. 25, 1954, p. 385) shows that if the "constant magnitude" is going to zero (so that the system is convergent) the convergence is not to a minimum of the expected error (this is usually what we want backprop to do), but to a minimum of the *median* of the error. Chris Darken darken at learning.scr.siemens.com
From munro at lis.pitt.edu Thu Feb 11 11:14:44 1993 From: munro at lis.pitt.edu (fac paul munro) Date: Thu, 11 Feb 93 11:14:44 EST Subject: Summary of "Does backprop need the derivative ??" In-Reply-To: Mail from 'Heini Withagen ' dated: Tue, 9 Feb 1993 11:46:06 +0100 (MET) Message-ID: <9302111614.AA15497@icarus.lis.pitt.edu>
Forgive the review of college math, but there are a few issues that, while obvious to many, might be worth reviewing here...
[1] The gradient of a well-behaved single-valued function of N variables (here the error as a function of the weights) is generally orthogonal to an N-1 dimensional manifold on which the function is constant (an iso-error surface).
[2] The effect of infinitesimal motion in the space on the function can be computed as the inner (dot) product of the gradient vector with the movement vector; thus, as long as the dot product between the gradient and the delta-w vector is negative, the error will decrease. That is, the new iso-error surface will correspond to a lower error value.
[3] This implies that the signs of the error derivatives are adequate to reduce the error, assuming the learning rate is sufficiently small, since any two vectors with all components of the same sign must have a positive inner product! [They lie in the same orthant of the space]
Having said all this, I must point out that the argument pertains only to single patterns. That is, eliminating the derivative term is guaranteed to reduce the error for the pattern that is presented. Its effect on the error summed over the training set is not guaranteed, even for batch learning... One more caveat: Of course, if the nonlinear part of the units' transfer function is non-monotonic (i.e., the sign of the derivative varies), be sure to throw the derivative back in! - Paul Munro
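A minimal numerical sketch of points [2] and [3]: for a single pattern and a small enough step, a weight change built only from the signs of the gradient components still has a negative inner product with the gradient, so that pattern's error drops. The network size, random seed, target and learning rate below are illustrative assumptions, not values taken from any message in this thread, and the gradient is obtained by central differences purely to keep the sketch short.

import numpy as np

rng = np.random.default_rng(0)

# A tiny one-hidden-layer sigmoid net and a single training pattern (all values illustrative).
W1 = rng.normal(scale=0.5, size=(3, 4))    # input -> hidden weights
W2 = rng.normal(scale=0.5, size=(4, 1))    # hidden -> output weights
x  = rng.normal(size=3)
t  = np.array([0.7])                       # target for this one pattern

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def error(ws):
    h = sigmoid(x @ ws[0])
    y = sigmoid(h @ ws[1])
    return 0.5 * np.sum((y - t) ** 2)      # error for this single pattern

def numeric_grad(ws, eps=1e-6):
    # Central-difference gradient of the single-pattern error w.r.t. every weight.
    grads = []
    for k, w in enumerate(ws):
        g = np.zeros_like(w)
        for idx in np.ndindex(*w.shape):
            wp = [u.copy() for u in ws]; wp[k][idx] += eps
            wm = [u.copy() for u in ws]; wm[k][idx] -= eps
            g[idx] = (error(wp) - error(wm)) / (2 * eps)
        grads.append(g)
    return grads

eta = 1e-3                                 # the "sufficiently small" learning rate of point [3]
grads = numeric_grad([W1, W2])
signed = [w - eta * np.sign(g) for w, g in zip([W1, W2], grads)]

print("error before sign-only step: %.6f" % error([W1, W2]))
print("error after  sign-only step: %.6f" % error(signed))

As the caveat above notes, this says nothing about the error summed over a whole training set; only the presented pattern's error is guaranteed to fall.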
From dhw at t13.Lanl.GOV Thu Feb 11 17:19:13 1993 From: dhw at t13.Lanl.GOV (David Wolpert) Date: Thu, 11 Feb 93 15:19:13 MST Subject: new paper Message-ID: <9302112219.AA23017@t13.lanl.gov>
*************************************************************** DO NOT FORWARD TO OTHER BOARDS OR LISTS ***************************************************************
The following paper has been placed in neuroprose, under the name wolpert.nips92.ps.Z. It is a major revision of an earlier preprint on the same topic. An abbreviated version (2 fewer pages) will appear in the proceedings of NIPS 92.
ON THE USE OF EVIDENCE IN NEURAL NETWORKS. David H. Wolpert, Santa Fe Institute
Abstract: The Bayesian evidence approximation, which is closely related to generalized maximum likelihood, has recently been employed to determine the noise and weight-penalty terms for training neural nets. This paper shows that it is far simpler to perform the exact calculation than it is to set up the evidence approximation. Moreover, unlike that approximation, the exact result does not have to be re-calculated for every new data set. Nor does it require the running of complex numerical computer code (the exact result is closed form). In addition, it turns out that for neural nets, the evidence procedure's MAP estimate is *in toto* approximation error. Another advantage of the exact analysis is that it does not lead to incorrect intuition, like the claim that one can "evaluate different priors in light of the data". This paper ends by discussing sufficiency conditions for the evidence approximation to hold, along with the implications of those conditions. Although couched in terms of neural nets, the analysis of this paper holds for any Bayesian interpolation problem.
Recover the file in the usual way: unix> ftp cheops.cis.ohio-state.edu Connected to cheops.cis.ohio-state.edu. 220 cheops.cis.ohio-state.edu FTP server ready. Name: anonymous 331 Guest login ok, send ident as password. Password: {your address} 230 Guest login ok, access restrictions apply. ftp> binary 200 Type set to I. ftp> cd pub/neuroprose 250 CWD command successful. ftp> get wolpert.nips92.ps.Z 200 PORT command successful. 150 Opening BINARY mode data connection for wolpert.nips92.ps.Z 226 Transfer complete. 100000 bytes sent in 3.14159 seconds ftp> quit 221 Goodbye. unix> uncompress wolpert.nips92.ps.Z unix> lpr wolpert.nips92.ps (or however you print PostScript files)
From mozer at dendrite.cs.colorado.edu Fri Feb 12 00:10:05 1993 From: mozer at dendrite.cs.colorado.edu (Michael C. Mozer) Date: Thu, 11 Feb 1993 22:10:05 -0700 Subject: connectionist models summer school -- final call for applications Message-ID: <199302120510.AA06977@neuron.cs.colorado.edu>
FINAL CALL FOR APPLICATIONS
CONNECTIONIST MODELS SUMMER SCHOOL
The University of Colorado will host the 1993 Connectionist Models Summer School from June 21 to July 3, 1993. The purpose of the summer school is to provide training to promising young researchers in connectionism (neural networks) by leaders of the field and to foster interdisciplinary collaboration. This will be the fourth such program in a series that was held at Carnegie-Mellon in 1986 and 1988 and at UC San Diego in 1990. Previous summer schools have been extremely successful and we look forward to the 1993 session with anticipation of another exciting event.
The summer school will offer courses in many areas of connectionist modeling, with emphasis on artificial intelligence, cognitive neuroscience, cognitive science, computational methods, and theoretical foundations. Visiting faculty (see list of invited faculty below) will present daily lectures and tutorials, coordinate informal workshops, and lead small discussion groups. The summer school schedule is designed to allow for significant interaction among students and faculty. As in previous years, a proceedings of the summer school will be published.
Applications will be considered only from graduate students currently enrolled in Ph.D. programs. About 50 students will be accepted. Admission is on a competitive basis. Tuition will be covered for all students, and we expect to have scholarships available to subsidize housing and meal costs, but students are responsible for their own travel arrangements.
Applications should include the following materials:
* a vita, including mailing address, phone number, electronic mail address, academic history, list of publications (if any), and relevant courses taken with instructors' names and grades received;
* a one-page statement of purpose, explaining major areas of interest and prior background in connectionist modeling and neural networks;
* two letters of recommendation from individuals familiar with the applicant's work (either mailed separately or in sealed envelopes); and
* a statement from the applicant describing potential sources of financial support available (department, advisor, etc.) for travel expenses.
Applications should be sent to: Connectionist Models Summer School c/o Institute of Cognitive Science Campus Box 344 University of Colorado Boulder, CO 80309
All application materials must be received by March 1, 1993. Admission decisions will be announced around April 15. If you have specific questions, please write to the address above or send e-mail to "cmss at cs.colorado.edu". Application materials cannot be accepted via e-mail.
Organizing Committee Jeff Elman (UC San Diego) Mike Mozer (University of Colorado) Paul Smolensky (University of Colorado) Dave Touretzky (Carnegie Mellon) Andreas Weigend (Xerox PARC and University of Colorado) Additional faculty will include: Yaser Abu-Mostafa (Cal Tech) Sue Becker (McMaster University) Andy Barto (University of Massachusetts, Amherst) Jack Cowan (University of Chicago) Peter Dayan (Salk Institute) Mary Hare (Birkbeck College) Cathy Harris (Boston University) David Haussler (UC Santa Cruz) Geoff Hinton (University of Toronto) Mike Jordan (MIT) John Kruschke (Indiana University) Jay McClelland (Carnegie Mellon) Ennio Mingolla (Boston University) Steve Nowlan (Salk Institute) Dave Plaut (Carnegie Mellon) Jordan Pollack (Ohio State) Dean Pomerleau (Carnegie Mellon) Dave Rumelhart (Stanford) Patrice Simard (ATT Bell Labs) Terry Sejnowski (UC San Diego and Salk Institute) Sara Solla (ATT Bell Labs) Janet Wiles (University of Queensland) The Summer School is sponsored by the American Association for Artificial Intelligence, the National Science Foundation, Siemens Research Center, and the University of Colorado Institute of Cognitive Science. Colorado has recently passed a law explicitly denying protection for lesbians, gays, and bisexuals. However, the Summer School does not discriminate in admissions on the basis of age, sex, race, national origin, religion, disability, veteran status, or sexual orientation.  From heiniw at sun1.eeb.ele.tue.nl Tue Feb 9 05:46:06 1993 From: heiniw at sun1.eeb.ele.tue.nl (Heini Withagen) Date: Tue, 9 Feb 1993 11:46:06 +0100 (MET) Subject: Summary of "Does backprop need the derivative ??" Message-ID: <9302091046.AA08161@sun1.eeb.ele.tue.nl> A non-text attachment was scrubbed... Name: not available Type: text Size: 191 bytes Desc: not available Url : https://mailman.srv.cs.cmu.edu/mailman/private/connectionists/attachments/00000000/7e2fa3eb/attachment-0001.ksh From kolen-j at cis.ohio-state.edu Tue Feb 9 08:46:53 1993 From: kolen-j at cis.ohio-state.edu (john kolen) Date: Tue, 9 Feb 93 08:46:53 -0500 Subject: Does backprop need the derivative ?? In-Reply-To: Scott_Fahlman@sef-pmax.slisp.cs.cmu.edu's message of Sun, 07 Feb 93 12:56:03 EST <9302091257.AA06456@everest.eng.ohio-state.edu> Message-ID: <9302091346.AA28166@pons.cis.ohio-state.edu> From: Scott_Fahlman at sef-pmax.slisp.cs.cmu.edu Of course, a learning system implemented in analog hardware might have only a few bits of accuracy due to noise and nonlinearity in the circuits, but it wouldn't suffer from this quantization effect, since you get a sort of probabilistic dithering for free. This assumes, of course, that the mechanism is actually "computing" using the available bits. Bits are the result of binary measurements. An analog device does not normally convert voltages or currents into a binary representation and then operate on it. An analog mechanism sloppilly implementing backprop should be able to tweak the weights in the general direction, but not necessarily the same direction as theoretical backprop. John Kolen  From KRUSCHKE at ucs.indiana.edu Tue Feb 9 09:45:45 1993 From: KRUSCHKE at ucs.indiana.edu (John K. Kruschke) Date: Tue, 9 Feb 93 09:45:45 EST Subject: postdoctoral traineeships available Message-ID: POST-DOCTORAL FELLOWSHIPS AT INDIANA UNIVERSITY Postdoctoral Traineeships in MODELING OF COGNITIVE PROCESSES Please call this notice to the attention of all interested parties. 
The Psychology Department and Cognitive Science Programs at Indiana University are pleased to announce the availability of one or more Postdoctoral Traineeships in the area of Modeling of Cognitive Processes. The appointment will pay rates appropriate for a new PhD (about $18,800), and will be for one year, starting after July 1, 1993. The duration could be extended to two years if a training grant from NIH is funded as anticipated (we should receive final notification by May 1). Post-docs are offered to qualified individuals who wish to further their training in mathematical modeling or computer simulation modeling, in any substantive area of cognitive psychology or Cognitive Science. We are particularly interested in applicants with strong mathematical, scientific, and research credentials. Indiana University has superb computational and research facilities, and faculty with outstanding credentials in this area of research, including Richard Shiffrin and James Townsend, co-directors of the training program, and Robert Nosofsky, Donald Robinson, John Castellan, John Kruschke, Robert Goldstone, Geoffrey Bingham, and Robert Port. Trainees will be expected to carry out original theoretical and empirical research in association with one or more of these faculty and their laboratories, and to interact with other relevant faculty and the other pre- and postdoctoral trainees. Interested applicants should send an up to date vitae, personal letter describing their specific research interests, relevant background, goals, and career plans, and reference letters from two individuals. Relevant reprints and preprints should also be sent. Women, minority group members, and handicapped individuals are urged to apply. PLEASE NOTE: The conditions of our anticipated grant restrict awards to US citizens, or current green card holders. Awards will also have a 'payback' provision, generally requiring awardees to carry out research or teach for an equivalent period after termination of the traineeship. Send all materials to: Professors Richard Shiffrin and James Townsend, Program Directors Department of Psychology, Room 376B Indiana University Bloomington, IN 47405 We may be contacted at: 812-855-2722; Fax: 812-855-4691 email: shiffrin at ucs.indiana.edu Indiana University is an Affirmative Action Employer  From kenm at prodigal.psych.rochester.edu Tue Feb 9 10:50:49 1993 From: kenm at prodigal.psych.rochester.edu (Ken McRae) Date: Tue, 9 Feb 93 10:50:49 EST Subject: paper available Message-ID: <9302091550.AA20269@prodigal.psych.rochester.edu> The following paper is now available in pub/neuroprose. Catastrophic Interference is Eliminated in Pretrained Networks Ken McRae University of Rochester & Phil A. Hetherington McGill University When modeling strictly sequential experimental memory tasks, such as serial list learning, connectionist networks appear to experience excessive retroactive interference, known as catastrophic interference (McCloskey & Cohen,1989; Ratcliff, 1990). The main cause of this interference is overlap among representations at the hidden unit layer (French, 1991; Hetherington,1991; Murre, 1992). This can be alleviated by constraining the number of hidden units allocated to representing each item, thus reducing overlap and interference (French, 1991; Kruschke, 1992). When human subjects perform a laboratory memory experiment, they arrive with a wealth of prior knowledge that is relevant to performing the task. 
If a network is given the benefit of relevant prior knowledge, the representation of new items is constrained naturally, so that a sequential task involving novel items can be performed with little interference. Three laboratory memory experiments (ABA free recall, serial list, and ABA paired-associate learning) are used to show that little or no interference is found in networks that have been pretrained with a simple and relevant knowledge base. Thus, catastrophic interference is eliminated when critical aspects of simulations are made to be more analogous to the corresponding human situation. Thanks again to Jordan Pollack for maintaining this electronic library. An example of how to retrieve mcrae.pretrained.ps.Z: your machine> ftp archive.cis.ohio-state.edu Connected to archive.cis.ohio-state.edu. 220 archive FTP server (Version 6.15 Thu Apr 23 15:28:03 EDT 1992) ready. Name (archive.cis.ohio-state.edu:kenm): anonymous 331 Guest login ok, send e-mail address as password. Password: 230 Guest login ok, access restrictions apply. ftp> cd pub/neuroprose 250-Please read the file README 250- it was last modified on Mon Feb 17 15:51:43 1992 - 357 days ago 250-Please read the file README~ 250- it was last modified on Wed Feb 6 16:41:29 1991 - 733 days ago 250 CWD command successful. ftp> binary 200 Type set to I. ftp> get mcrae.pretrained.ps.Z 200 PORT command successful. 150 Opening BINARY mode data connection for mcrae.pretrained.ps.Z (129046 bytes). 226 Transfer complete. local: mcrae.pretrained.ps.Z remote: mcrae.pretrained.ps.Z 129046 bytes received in 30 seconds (4.2 Kbytes/s) ftp> quit 221 Goodbye. your machine> uncompress mcrae.pretrained.ps.Z your machine> then print the file  From kolen-j at cis.ohio-state.edu Tue Feb 9 13:31:43 1993 From: kolen-j at cis.ohio-state.edu (john kolen) Date: Tue, 9 Feb 93 13:31:43 -0500 Subject: Test & Derivatives in Backprop Message-ID: <9302091831.AA00142@pons.cis.ohio-state.edu> [I hope that this makes it to connectionists, the last couple of postings haven't made it back. So I have summarized these replies in one message for general consumption.] Regarding the latest talk about derivatives in backprop, I had looked into replacing the different mathematical operations with other, more implementation-amenable operations. This included replacing the derivative of the squashing function with d(x)=min(x,1-x). The results of these tests show that backprop is pretty stable as long as the qualitative shape of the operations are maintained. If you replace the derivative with a constant or linear (wrt activation) function it doesn't work at all for the learning tasks I considered. As long as the derivative replacement is minimal in the extreme activations and maximal at 0.5 (wrt the traditional sigmoid), the operation will not suffer dramatically. After reading Fahlman's observation about loosing bits to noise I had the following response. Bits come from binary decisions. Analog systems don't do that in normal processing, normally some continuous value affects another continuous value. No where do they perform A/D conversion and then operate on the bits. If there is no measurement device, then talking about bits doesn't make sense. John Kolen  From guy at cs.uq.oz.au Tue Feb 9 17:25:35 1993 From: guy at cs.uq.oz.au (guy@cs.uq.oz.au) Date: Wed, 10 Feb 93 08:25:35 +1000 Subject: Does backprop need the derivative ?? 
Message-ID: <9302092225.AA06661@client>
The question has been asked whether the full derivative is needed for backprop to work, or whether the sign of the derivative is sufficient. As far as I am aware, the discussion has not defined at what point the derivative is truncated to +/-1. This might occur (1) for each input/output pair, when the error is fed into the output layer; (2) in epoch-based learning, where the exact derivative of each weight over the training set might be computed but the update to the weight truncated; or (3...) in many intermediate cases.
I believe one problem with limited-precision weights is as follows. The magnitude of the update may be smaller than the limit of precision on the weight (which has much greater magnitude). If the machine arithmetic then rounds the updated weight to the nearest representable value, the updated weight will be rounded to its old value, and no learning will occur.
I am co-author of a technical report which addressed this problem. In our algorithm, weights had very limited precision but their derivatives over the whole training set were computed exactly. The weight update step would shift the weight value to the next representable value with a probability proportional to the size of the derivative. In our non-exhaustive testing, we found that very limited precision weights and activations could be used. The technical report is available in hardcopy (limited numbers) and PostScript. My addresses are "guy at cs.uq.oz.au" and "Guy Smith, Department of Computer Science, The University of Queensland, St Lucia 4072, Australia". Guy Smith.
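A small sketch of the update rule as it is described above: weights live on a coarse grid, and each weight either stays put or moves one grid step downhill, with probability proportional to the magnitude of its exactly computed batch derivative. The grid spacing, the probability scaling and the example numbers are assumptions made only for illustration, not details from the technical report.

import numpy as np

rng = np.random.default_rng(1)

def stochastic_grid_update(w, grad, grid=0.125, eta=1.0):
    # Move a weight one grid step downhill with probability proportional to
    # the magnitude of its derivative (clipped at 1); otherwise leave it alone.
    p = np.clip(eta * np.abs(grad) / grid, 0.0, 1.0)
    move = rng.random(size=w.shape) < p
    return w - grid * np.sign(grad) * move

w = np.array([0.250, -0.125, 0.500])       # weights confined to a 0.125 grid
g = np.array([0.030, -0.004, 0.000])       # exactly computed batch derivatives
for _ in range(3):
    w = stochastic_grid_update(w, g)
    print(w)

In expectation each move equals an ordinary gradient step of size eta, which is why learning can still proceed even though every individual update is either zero or a full grid step.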
From meng at spring.kuee.kyoto-u.ac.jp Wed Feb 10 11:58:19 1993 From: meng at spring.kuee.kyoto-u.ac.jp (meng@spring.kuee.kyoto-u.ac.jp) Date: Wed, 10 Feb 93 11:58:19 JST Subject: Does backprop need the derivative ?? In-Reply-To: Scott_Fahlman@SEF-PMAX.SLISP.CS.CMU.EDU's message of Sun, 07 Feb 93 13:02:42 EST <9302091925.AA12414@ntt-sh.ntt.jp> 9 Feb 93 1:36:51 EST 9 Feb 93 1:35:08 EST 7 Feb 93 13:03:24 EST Message-ID: <9302100258.AA20634@spring.kuee.kyoto-u.ac.jp>
Thinking about it, it seems that the derivative can always be replaced by a sufficiently small constant. That is, for a certain training set and a certain requirement of precision on the output units, you can find a constant small enough that, with the same starting point, it will find the same minimum for the same network as an algorithm that uses the derivative. The problem with this, of course, is that the constant may be so small that the training time becomes prohibitive, while the motivation for using such a constant is to speed up training. The reason this works in a lot of instances is, I think, that the requirement of precision is wide enough to let the network jump into a region that is sufficiently close to a minimum. A situation where it wouldn't work would be one where the network is moving in the right direction but jumping too far, i.e. jumping from one side of a valley to the other alternately, never landing within a region that would give convergence within the requirements set. The use of the derivative solves this by getting smaller when approaching a minimum. Another possibility is that, using a constant, the network might settle in another minimum (or try to settle in another ("wider") minimum) by virtue of "seeing" the error surface as more coarse grained than the version using a derivative. In some cases, if you're lucky (i.e. the network has a good initial state in relation to a minimum and the constant you're using), you might hit the bull's eye; with another initial state you might be oscillating around the solution (i.e. the error goes up and down without getting within the required limit). In such a case you could switch to using the derivative or simply decrease the constant (maybe how much could be computed on the basis of the increase in error? Just an idea). These are just some thoughts on the subject, no empirical study undertaken. Tore
From "MVUB::BLACK%hermes.mod.uk" at relay.MOD.UK Fri Feb 12 09:50:00 1993 From: "MVUB::BLACK%hermes.mod.uk" at relay.MOD.UK (John V. Black @ DRA Malvern) Date: Fri, 12 Feb 93 14:50 GMT Subject: IEE Third International Conference on ANN's (Registration Announcement) Message-ID:
CONFERENCE ANNOUNCEMENT =======================
IEE Third International Conference on Artificial Neural Networks Brighton, UK, 25-27 May 1993. -----------------------------------------------------
This conference, organised by the Institution of Electrical Engineers, will cover up-to-date reports on the current state of research on Artificial Neural Networks, including theoretical understanding of fundamental structures, learning algorithms, implementation and applications. Over 70 papers will be presented in formal and poster sessions under the following headings: APPLICATIONS, ARCHITECTURES, VISION, CONTROL & ROBOTICS, MEDICAL SYSTEMS, NETWORK ANALYSIS. In addition there will be a small exhibition and publishers' display, a Civic Reception and a Conference Dinner.
Registration fees are as follows: Member (IEE/associated societies) 235 pounds sterling (inc 35 pounds VAT); Non-member 294 pounds sterling (inc 43.79 pounds VAT); Research Student or Retired 83 pounds sterling (inc 12.36 pounds VAT).
Further information, including the full programme, is available from: Sheila Griffiths, ANN93 Secretariat, Conference Services, Institution of Electrical Engineers, Savoy Place, London WC2R 0BL, UK. Tel: 071 344 5478/5477 Fax: 071 497 3633 Telex: 261776 IEE LDN G
John Black (jvb%hermes.mod.uk at relay.mod.uk) E-mailing for David Lowe
From kolen-j at cis.ohio-state.edu Fri Feb 12 08:11:58 1993 From: kolen-j at cis.ohio-state.edu (john kolen) Date: Fri, 12 Feb 93 08:11:58 -0500 Subject: Does backprop need the derivative ?? In-Reply-To: Mark Evans's message of Thu, 11 Feb 93 10:26:03 GMT <3468.9302111026@it-research-institute.brighton.ac.uk> Message-ID: <9302121311.AA20446@pons.cis.ohio-state.edu>
When I used the term stable in my previous posting, I did not intend the mathematical notion of stability as applied to a control system. What I meant was that the apparent behavior of the network, learning a set of associations of patterns, was unaffected by quantitative changes in these operations. An analogy I often use is the symbolic dynamics of unimodal iterated function systems. As long as a small number of qualitative conditions are true, the system will exhibit the same symbol dynamics as other functions for which the conditions hold, regardless of the numerical differences between the functions. Thus the bifurcation diagrams of rx(1-x) and a bump made up of sigmoids will exhibit the same type of period-doubling cascades. Even if it weren't mathematically stable, but were guaranteed to pass through a region of weight space with usable weights, most of the NN community would find it useful. John
From shim at marlin.nosc.mil Fri Feb 12 13:00:08 1993 From: shim at marlin.nosc.mil (Randy L. Shimabukuro) Date: Fri, 12 Feb 93 10:00:08 -0800 Subject: Summary of "Does backprop need the derivative ??"
Message-ID: <9302121800.AA01359@marlin.nosc.mil>
Congratulations on initiating a very lively discussion. From reading the responses though, it appears that people are interpreting your question differently. At the risk of adding to the confusion let me try to explain. It seems that some people are talking about the derivative of the transfer function (F') while others are talking about the gradient of the error function. We have looked at both cases:
We approximate F' in a manner similar to that suggested by George Bolt, letting F'(|x|) -> 1 for |x| < r and F'(|x|) -> a for |x| >= r, where a is a small positive constant and r is a point where F'(r) is approximately 1.
We have also, in a sense, approximated the gradient of the error function by quantizing the weight updates. This is similar to what Peterson and Hartman call "Manhattan updating". In this case it is important to preserve the sign of the derivative.
We have found that the first type of approximation has very little effect on back propagation. Depending on the problem, the second type sometimes shortens the learning time and sometimes prevents the network from learning. In some cases it helps to decrease the size of the updates as learning progresses. Randy Shimabukuro
From hartman%pav.mcc.com at mcc.com Sat Feb 13 17:36:04 1993 From: hartman%pav.mcc.com at mcc.com (E. Hartman) Date: Sat, 13 Feb 93 16:36:04 CST Subject: Re. does bp need the derivative? Message-ID: <9302132236.AA01583@energy.pav.mcc.com>
Re. the question of the derivative in backprop, Javier Movellan and Randy Shimabukuro mentioned the "Manhattan updating" discussed in Peterson and Hartman ("Explorations of the Mean Field Theory Learning Algorithm", Neural Networks, Vol. 2, pp. 475-494, 1989). This technique computes the gradient exactly, but then keeps only the signs of the components and takes fixed-size weight steps (each weight is changed by a fixed amount, either up or down).
We used this technique to advantage, both in backprop and mean field theory nets, on problems with inconsistent data -- data containing exemplars with identical inputs but differing outputs (one-to-many mapping). (The problem in the paper was a classification problem drawn from overlapping Gaussian distributions.)
The reason that this technique helped on this kind of problem is the following. Since the data was highly inconsistent, we found that before taking a step in weight space, it helped to average out the data inconsistencies by accumulating the gradient over a large number of patterns (large batch training). But, typically, it happens that some components of the gradient don't "average out" nicely and instead become very large. So the components of the gradient vary greatly in magnitude, which makes choosing a good learning rate difficult. "Manhattan updating" makes all the components equal in magnitude. We found it necessary to slowly reduce the step size as training proceeds. Eric Hartman
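A compact sketch of the batch "Manhattan updating" recipe described in these two messages: accumulate the exact gradient over the whole batch, keep only the signs, step every weight by the same fixed amount, and slowly shrink that amount. The single logistic unit, the label noise standing in for inconsistent data, and the step-size schedule are assumptions for illustration only, not the setup of the Peterson and Hartman paper.

import numpy as np

rng = np.random.default_rng(0)

# Noisy, overlapping two-class data: similar inputs can carry different labels,
# a crude stand-in for the "inconsistent data" discussed above.
X = rng.normal(size=(200, 3))
y = (X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=2.0, size=200) > 0).astype(float)

w, step = np.zeros(3), 0.05
for epoch in range(200):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))     # forward pass over the whole batch
    grad = X.T @ (p - y) / len(y)          # exact batch gradient (logistic loss)
    w -= step * np.sign(grad)              # fixed-size step, signs only
    step *= 0.99                           # slowly reduce the step size
print("learned weights:", np.round(w, 2))

Because every component moves by the same amount, an occasional very large gradient component no longer dictates the learning rate, and the decaying step size takes over the role that the shrinking derivative normally plays near a minimum.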
From marwan at sedal.su.oz.au Sat Feb 13 03:03:55 1993 From: marwan at sedal.su.oz.au (Marwan Jabri) Date: Sat, 13 Feb 1993 19:03:55 +1100 Subject: Test & Derivatives in Backprop Message-ID: <9302130803.AA03429@sedal.sedal.su.OZ.AU>
> From: john kolen
>
> [I hope that this makes it to connectionists, the last couple of postings
> haven't made it back. So I have summarized these replies in one message
> for general consumption.]
>
> Regarding the latest talk about derivatives in backprop, I had looked into
> replacing the different mathematical operations with other, more
> implementation-amenable operations. This included replacing the
> derivative of the squashing function with d(x)=min(x,1-x). The results of
> these tests show that backprop is pretty stable as long as the qualitative
> shape of the operations are maintained. If you replace the derivative with
> a constant or linear (wrt activation) function it doesn't work at all for
> the learning tasks I considered. As long as the derivative replacement is
> minimal in the extreme activations and maximal at 0.5 (wrt the traditional
> sigmoid), the operation will not suffer dramatically.
>
> After reading Fahlman's observation about loosing bits to noise I had the
> following response. Bits come from binary decisions. Analog systems
> don't do that in normal processing, normally some continuous value affects
> another continuous value. No where do they perform A/D conversion and then
> operate on the bits. If there is no measurement device, then talking about
> bits doesn't make sense.
>
> John Kolen
>
Are we talking about analog implementations? I hope so, because I am. If not, then forget this message. The derivative issue boils down to whether you can implement the approximation cheaply, whatever it is. The implication for training speed depends on how good your gradient approximations are. The bit-width issue boils down to how you will implement your storage (weights). Whether you use analog EEPROM, RAM converted with DACs or whatever, you have to deal with bit effects. That is, unless you have a new analog high-precision storage device that can be implemented cheaply, in which case I will be eager to learn about it. If you have the analog dream device, then your next problem in analog implementation is the signal/noise ratio. That is, unless your analog circuits are noiseless. Marwan
------------------------------------------------------------------- Marwan Jabri Email: marwan at sedal.su.oz.au Senior Lecturer Tel: (+61-2) 692-2240 SEDAL, Electrical Engineering, Fax: 660-1228 Sydney University, NSW 2006, Australia Mobile: (+61-18) 259-086
From miller at picard.ads.com Mon Feb 15 11:32:44 1993 From: miller at picard.ads.com (Kenyon Miller) Date: Mon, 15 Feb 93 11:32:44 EST Subject: Summary of "Does backprop need the derivative ??" Message-ID: <9302151632.AA03270@picard.ads.com>
Paul Munro writes:
> [3] This implies that the signs of the error derivatives are adequate to reduce
> the error, assuming the learning rate is sufficiently small,
> since any two vectors with all components of the same sign
> must have a positive inner product! [They lie in the same
> orthant of the space]
I believe a critical point is being missed, that is, the derivative is being replaced by its sign at every stage in applying the chain rule, not just to the initial backpropagation of the error. Consider the following example:

        ----n2-----
       /           \
  w--n1             n4
       \           /
        ----n3-----

In other words, there is an output neuron n4 which is connected to two neurons n2 and n3, each of which is connected to neuron n1, which has a weight w. Suppose the weight connecting n2 to n4 is negative and all other connections in the diagram are positive. Suppose further that n2 is saturated and none of the other neurons are saturated. Now, suppose that n4 must be decreased in order to reduce the error. Backpropagating along the n4-n2-n1 path, w receives an error term which would tend to increase n1, while backpropagating along the n4-n3-n1 path would result in a term which would tend to decrease n1. If the true sigmoid derivative were used, the force to increase n1 would be dampened because n2 is saturated, and the net result would be to decrease w and therefore decrease n1 and n3 and decrease n4. However, replacing the sigmoid derivative with a constant could easily allow the n4-n2-n1 path to dominate, and the error at the output would increase. Thus, it is not a sound thing to do regardless of how many patterns are used for training. -Ken Miller.
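A small numeric check of this example, with weights chosen (as pure illustration) so that n2 sits deep in saturation: the exact chain rule and a constant-derivative chain rule assign opposite signs to dE/dw, and a step taken under the constant rule raises this pattern's error.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights for the four-unit example above; w24 is the one negative
# connection, and w12 is large so that n2 is strongly saturated.
x, w = 1.0, 2.2            # input and the weight feeding n1
w12, w13 = 6.0, 1.0        # n1 -> n2, n1 -> n3
w24, w34 = -5.0, 1.0       # n2 -> n4, n3 -> n4
target = 0.0               # we want the output n4 to come down

def forward(w):
    n1 = sigmoid(w * x)
    n2 = sigmoid(w12 * n1)
    n3 = sigmoid(w13 * n1)
    n4 = sigmoid(w24 * n2 + w34 * n3)
    return n1, n2, n3, n4

def dE_dw(w, deriv):
    # Backprop of E = 0.5*(n4 - target)^2, with `deriv` standing in for sigmoid'.
    n1, n2, n3, n4 = forward(w)
    d4 = (n4 - target) * deriv(n4)
    d2 = d4 * w24 * deriv(n2)
    d3 = d4 * w34 * deriv(n3)
    d1 = (d2 * w12 + d3 * w13) * deriv(n1)
    return d1 * x

exact = dE_dw(w, deriv=lambda o: o * (1.0 - o))   # true sigmoid derivative
const = dE_dw(w, deriv=lambda o: 1.0)             # derivative replaced by a constant

print("exact dE/dw:          %+.2e" % exact)      # positive: w should decrease
print("constant-rule dE/dw:  %+.2e" % const)      # negative: rule says increase w

eta = 0.25
E0 = 0.5 * (forward(w)[3] - target) ** 2
E1 = 0.5 * (forward(w - eta * const)[3] - target) ** 2
print("error before %.6e, after constant-rule step %.6e" % (E0, E1))

The exact rule damps the n4-n2-n1 path by the near-zero derivative at the saturated n2, so the n4-n3-n1 path wins and w is pushed down; the constant rule lets the saturated path dominate and pushes w the wrong way, which is exactly the point made above.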
From kanal at cs.UMD.EDU Mon Feb 15 12:35:27 1993 From: kanal at cs.UMD.EDU (Laveen N. Kanal) Date: Mon, 15 Feb 93 12:35:27 -0500 Subject: non-Turing machines? Message-ID: <9302151735.AA10355@mimsy.cs.UMD.EDU>
I have only tuned into part of the quantum computers discussion and so I don't know if the following references have been mentioned in the discussion. Having speculated about natural perception not being modelable by Turing machines, I was not surprised to find similar speculation in the book Renewing Philosophy by Hilary Putnam (Harvard Univ. Press, 1992), which I picked up at the bookstore the other day. But Putnam does cite two specific references which may be of interest in this context.
Marian Boykan Pour-El and Ian Richards, "The Wave Equation with Computable Initial Data Such That Its Unique Solution Is Not Computable," Advances in Mathematics, 39 (1981), pp. 215-239.
Georg Kreisel's review of the above paper in The Journal of Symbolic Logic, 47, No. 4 (1982), pp. 900-902.
From ala at sans.kth.se Tue Feb 16 07:51:57 1993 From: ala at sans.kth.se (Anders Lansner) Date: Tue, 16 Feb 1993 13:51:57 +0100 Subject: MCPA'93 Call for Contributions Message-ID: <199302161251.AA02772@occipitalis.sans.kth.se>
MCPA'93 Final Call
**************************************************************************** * Invitation to * * International Workshop on Mechatronical Computer Systems * * for Perception and Action, June 1-3, 1993 * * Halmstad University, Sweden * * * * Final Call for Contributions * ****************************************************************************
Mechatronical Computer Systems that Perceive and Act - A New Generation =======================================================================
Mechatronical computer systems, which we will see in advanced products and production equipment of tomorrow, are designed to do much more than calculate. The interaction with the environment and the integration of computational modules in every part of the equipment, engaging in every aspect of its functioning, put new, and conceptually different, demands on the computer system. A development towards a complete integration between the mechanical system, advanced sensors and actuators, and a multitude of processing modules can be foreseen. At the systems level, powerful algorithms for perceptual integration, goal-direction and action planning in real time will be critical components. The resulting action-oriented systems may interact with their environments by means of sophisticated sensors and actuators, often with a high degree of parallelism, and may be able to learn and adapt to different circumstances and environments. Perceiving the objects and events of the external world and acting upon the situation in accordance with an appropriate behaviour, whether programmed, trained, or learned, are key functions of these next-generation computer systems.
The aim of this first International Workshop on Mechatronical Computer Systems for Perception and Action is to gather researchers and industrial development engineers, who work with different aspects of this exciting new generation of com- puting systems and computer-based applications, to a fruitful exchange of ideas and results and, often interdisciplinary, dis- cussions. Workshop Form ============= One of the days of the workshop will be devoted to true work- shop activities. The objective is to identify and propose research directions and key problem areas in mechatronical computing systems for perception and action. In the morning session, invited speakers, as well as other workshop dele- gates, will give their perspectives on the theme of the work- shop. The work will proceed in smaller working groups during the afternoon, after which the conclusions will be presented in a plenary session. The scientific programme will also include presentations of research results in oral or poster form, or as demonstrations. Subject Areas ============= Relevant subject areas are e.g.: Real-Time Systems Architecture and Real-Time Software. Sensor Systems and Sensory/Motor Coordination. Biologically Inspired Systems. Applications of Unsupervised and Reinforcement Learning. Real-Time Decision Making and Action Planning. Parallel Processor Architectures for Embedded Systems. Development Tools and Support Systems for Mechatronical Computer Systems and Applications. Dependable Computer Systems. Robotics and Machine Vision. Neural Networks in Real-Time Applications. Advanced Mechatronical Computing Demands in Industry. Contributions to the Workshop ============================= The programme committee welcomes all kinds of contribu- tions - papers to be presented orally or as posters, demon- strations, etc. - in the areas listed above, as well as other areas of relevance to the theme of the workshop. >From the workshop point of view, it is NOT essential that con- tributions contain only new, unpublished results. Rather, the new, interdisciplinary collection of delegates that can be expected at the workshop may motivate presentations of ear- lier published results. Specifically, we invite delegates to state their view of the workshop theme, including identification of key research issues and research directions. The planning of the workshop day will be based on these submitted statements , some of which will be presented in the plenary session, some of which in the smaller working groups. DEADLINES ========= Febr. 26, 1993: Submissions of extended abstracts or full papers. Submissions of statements regarding perspectives on the conference theme, that the delegate would like to present at the workshop (4 pages max). Submissions of descriptions of demonstrations, etc. March 19, 1993: Notification of acceptance. Preliminary final programme. May 1, 1993: Final papers and statements. All submissions shall be sent to the workshop secretariat, see address box. Please send two copies. Submissions must include name(s) and affiliation(s) of author(s) and full address, including phone and fax number and electronic mail address (if possible). The accepted papers and statements will be assembled into a Proceedings book given to the Workshop attendees. After the workshop a revised version of the proceedings, including results of the workshop discussions, will be published by an international publisher. Invited speakers ================ Prof. John A. Stankovic, University of Massachusetts, USA, and Scuola Superiore S. 
Anna, Pisa, Italy: "Major Real-Time Challenges for Mechatronical Systems" Prof. Jan-Olof Eklundh, CVAP, Royal Institute of Technology, Stockholm, Sweden: "Computer Vision and Seeing Systems" Prof. Dave Cliff, School of Cognitive and Computing Sciences and Neuroscience IRC, University of Sussex, U.K. "Animate Vision in an Artificial Fly: A Study in Computational Neuroethology" & "Visual Sensory-Motor Networks Without Design: Evolving Visually Guided Robots" (More invited speakers to be confirmed.) ORGANISERS ========== The workshop is arranged by CCA, the Centre for Computer Architecture at Halmstad University, Sweden, in cooperation with the DAMEK Mechatronics Research Group and the SANS (Studies of Artificial Neural Systems) Research Group, both at the Royal Institute of Technology (KTH), Stockholm, Sweden, and the Department of Computer Engineering, Chalmers University of Technology, Gothenburg, Sweden. The Organising Committee includes: Lars Bengtsson, CCA, Organising Chair Anders Lansner, SANS Kenneth Nilsson, CCA Bertil Svensson, Chalmers University of Technology and CCA, Programme and Conference Chair Per-Arne Wiberg, CCA Jan Wikander, DAMEK The workshop is supported by SNNS, the Swedish Neural Network Society. It is financially supported by Halmstad University, the County Administration of Halland, Swedish industries and NUTEK (the Swedish National Board for Industrial and Technical Development). Programme Committee =================== Bertil Svensson, Sweden (chair) Paolo Ancilotti, Italy Lars Bengtsson, Sweden Giorgio Buttazzo, Italy Robert Forchheimer, Sweden Anders Lansner, Sweden Kenneth Nilsson, Sweden John Stankovic, Italy and USA Jan Torin, Sweden Hendrik van Brussel, Belgium Per-Arne Wiberg, Sweden Jan Wikander, Sweden Workshop Language: English Workshop fee: SEK 2 000, incl. proceedings, lunch- eons, reception and workshop dinner. Early registration (before April 20) SEK 1750. The number of attendees to the workshop is limited. Among those not submitting a contribution attendance will be given on a first-come, first-served basis. Social Activities ================= Reception, workshop dinner. Deep sea fishing tour or a visit at Varberg castle/fortress and museum. Bring your family, a programme for accompanying persons will be arranged. How to get there ================ Halmstad is situated on the west coast of Sweden between Copenhagen and Gothenburg (major international airports). With a distance of 150 kilometres to each of these cities it is easy and convenient to reach Halmstad by train, bus or car. Halmstad Airport is linked to Stockholm International Airport (Arlanda). Flight time Stockholm - Halmstad is 50 minutes. Accomodation ============ Arrangements will be made with local hotels, both downtown Halmstad and at the seaside. Different price categories will be available. Please let us know what price category and loca- tion you prefer and we help you with the booking. Payment is made directly to the hotel. Prices (breakfast included) in SEK: CATEGORY 1: SEK 750-850 single room, 750-950 double room Downtown Single room ( ) Double room ( ) Seaside Single room ( ) Double room ( ) CATEGORY 2: SEK 400 single room, 450 double room Near town Single room ( ) Double room ( ) Transportation between the hotels and the University will be arranged. ( ) I register already now. Send preliminary programme when available. ( ) I do not register yet but want the preliminary programme when available. Name ................................................... 
...................................................... Address................................................. ....................................................... ....................................................... Tel., Fax, e-mail .................................... ........................................................ ------------------------------------------------------------------------- MCPA Workshop Centre for Computer Architecture Halmstad University Box 823 S-30118 HALMSTAD Sweden Tel. +46 35 153134 (Lars Bengtsson) Fax. +46 35 157387 email: mcpa at cca.hh.se ------------------------------------------------------------------------ END OF MESSAGE  From harris at ai.mit.edu Tue Feb 16 18:50:28 1993 From: harris at ai.mit.edu (John G. Harris) Date: Tue, 16 Feb 93 18:50:28 EST Subject: Postdoc position in computational/biological vision (learning) Message-ID: <9302162350.AA05713@portofino> One (or possibly two) postdoctoral positions are available for one or two years in computational vision starting September 1993 (flexible). The postdoc will work in Lucia Vaina's laboratory at Boston University, College of Engineering, to conduct research in learning the direction in global motion. The researchers currently involved in this project are Lucia M. Vaina, John Harris, Charlie Chubb, Bob Sekuler, and Federico Girosi. Requirements are PhD in CS or related area with experience in visual modeling or psychophysics. Knowledge of biologically relevant neural models is desirable. Stipend ranges from $28,000 to $35,000 depending upon qualifications. Deadline for application is March 1, 1993. Two letter of recommendation, description of current research and an up to date CV are required. In the research we combine computational psychophysics, neural networks modeling and analog VLSI to study visual learning specifically applied to direction in global motion. The global motion problem requires estimation of the direction and magnitude of coherent motion in the presence of noise. We are proposing a set of psychophysical experiments in which the subject, or the network must integrate noisy, spatially local motion information from across the visual field in order to generate a response. We will study the classes of neural networks which best approximate the pattern of learning demonstrated in psychophysical tasks. We will explore Hebbian learning, multilayer perceptrons (e.g. backpropagation), cooperative networks, Radial Basis Function and Hyper-Basis Functions. The various strategies and their implementation will be evaluated on the basis of their performance and their biological plausibility. For more details, contact Prof. Lucia M. Vaina at vaina at buenga.bu.edu or lmv at ai.mit.edu.  From learn at galaxy.huji.ac.il Wed Feb 17 09:37:58 1993 From: learn at galaxy.huji.ac.il (learn conference) Date: Wed, 17 Feb 93 16:37:58 +0200 Subject: Learning Days in Jerusalem Message-ID: <9302171437.AA04425@galaxy.huji.ac.il> ========== DEADLINE FOR SUBMISSIONS: March 1, 1993 ========================== THE HEBREW UNIVERSITY OF JERUSALEM THE CENTER FOR NEURAL COMPUTATION LEARNING DAYS IN JERUSALEM Workshop on Fundamental Issues in Biological and Machine Learning May 30 - June 4, 1993 Hebrew University, Jerusalem, Israel The Center for Neural Computation at the Hebrew University is a new multi- disciplinary research center for collaborative investigation of the principles underlying computation and information processing in the brain and in neuron- like artificial computing systems. 
The Center's activities span theoretical studies of neural networks in physics, biology and computer science; experimental investigations in neurophysiology, psychophysics and cognitive psychology; and applied research on software and hardware implementations. The first international symposium sponsored by the Center will be held in the spring of 1993, at the Hebrew University of Jerusalem. It will focus on theoretical, experimental and practical aspects of learning in natural and artificial systems. Topics for the meeting include: Theoretical Issues in Supervised and Unsupervised Learning Neurophysiological Mechanisms Underlying Learning Cognitive Psychology and Learning Psychophysics Applications of Machine and Neural Network Learning Invited speakers include: Moshe Abeles (Hebrew U.) Yann LeCun (AT&T) Aharon Agranat (Hebrew U.) Joseph LeDoux (NYU) Ehud Ahissar (Weizmann Inst.) Christoph von der Malsburg (U. Bochum) Asher Cohen (Hebrew U.) Yishai Mansour (Tel Aviv U.) Yuval Davidor (Weizmann Inst.) Bruce McNaughton (U. of Arizona) Yadin Dudai (Weizmann Inst.) Helge Ritter (U. Bielefeld) Martha Farah (U. Penn) David Rumelhart (Stanford) David Haussler (UCSC) Dov Sagi (Weizmann Inst.) Nathan Intrator (Tel Aviv U.) Menachem Segal (Weizmann Inst.) Larry Jacoby (McMaster U.) Alex Waibel (CMU, U. Karlsruhe) Michael Jordan (MIT) Norman Weinberger (U.C. Irvine) Participation in the Workshop is limited to 100. A small number of contributed papers will be accepted. Interested researchers and students are asked to submit registration forms by **** March 1, 1993,***** to: Sari Steinberg Bchiri Tel: (972) 2 584563 Center for Neural Computation Fax: (972) 2 584437 c/o Racah Institute of Physics E-mail: learn at galaxy.huji.ac.il Hebrew University 91904 Jerusalem, Israel To ensure participation, please send a copy of the registration form by e-mail or fax as soon as possible. Organizing Committee: Shaul Hochstein, Haim Sompolinsky, Naftali Tishby. -------------------------------------------------------------------------------- REGISTRATION FORM Please complete the following form. To ensure participation, please send a copy of this form by e-mail or fax as soon as possible to: Sari Steinberg Bchiri E-MAIL: learn at galaxy.huji.ac.il Center for Neural Computation TEL: 972-2-584563 c/o Racah Institute of Physics FAX: 972-2-584437 Hebrew University 91904 Jerusalem, Israel Registration will be confirmed by e-mail. CONFERENCE REGISTRATION Name: _________________________________________________________________________ Affiliation: __________________________________________________________________ Address: ______________________________________________________________________ City: __________________ State: ______________ Zip: _________ Country: ________ Telephone: (____)________________ E-mail address: ____________________________ REGISTRATION FEES ____ Regular registration (before March 1): $100 ____ Student registration (before March 1): $50 ____ Late registration (after March 1): $150 ____ Student late registration (after March 1): $75 Please send payment by check or international money order in US dollars made payable to: Learning Workshop with this form by March 1, 1993 to avoid late fee. ACCOMMODATIONS If you are interested in assistance in reserving hotel accommodation for the duration of the Workshop, please indicate your preferences below: I wish to reserve a single/double (circle one) room from __________ to __________, for a total of _______ nights. 
CONTRIBUTED PAPERS A very limited number of contributed papers will be accepted. Participants interested in submitting papers should complete the following and enclose a 250-word abstract.
Poster/Talk (circle one)
Title: __________________________________________________________________ __________________________________________________________________
From kak at max.ee.lsu.edu Wed Feb 17 13:26:28 1993 From: kak at max.ee.lsu.edu (Dr. S. Kak) Date: Wed, 17 Feb 93 12:26:28 CST Subject: Reprints Message-ID: <9302171826.AA05612@max.ee.lsu.edu>
Reprints of the following article are now available:
---------------------------------------------------------------- Circuits, Systems, & Signal Processing, vol. 12, 1993, pp. 263-278 ----------------------------------------------------------------
Feedback Neural Networks: New Characteristics and a Generalization
Subhash C. Kak Department of Electrical and Computer Engineering Louisiana State University, Baton Rouge, LA 70803, USA
ABSTRACT New characteristics of feedback neural networks are studied. We discuss in detail the question of updating of neurons given incomplete information about the state of the neural network. We show how the mechanism of self-indexing [Self-indexing of neural memories, Physics Letters A, Vol. 143, 293-296, 1990] for such updating provides better results than assigning 'don't know' values to the missing parts of the state vector. Issues related to the choice of the neural model for a feedback network are also considered. Properties of a new complex-valued neuron model that generalizes McCulloch-Pitts neurons are examined.
----- Note: This issue of the journal is devoted exclusively to articles on neural networks.
From radford at cs.toronto.edu Wed Feb 17 15:15:58 1993 From: radford at cs.toronto.edu (Radford Neal) Date: Wed, 17 Feb 1993 15:15:58 -0500 Subject: Paper on "A new view of the EM algorithm" Message-ID: <93Feb17.151609edt.555@neuron.ai.toronto.edu>
The following paper has been placed in the neuroprose archive, as the file 'neal.em.ps.Z':
A NEW VIEW OF THE EM ALGORITHM THAT JUSTIFIES INCREMENTAL AND OTHER VARIANTS
Radford M. Neal and Geoffrey E. Hinton Department of Computer Science University of Toronto
We present a new view of the EM algorithm for maximum likelihood estimation in situations with unobserved variables. In this view, both the E and the M steps of the algorithm are seen as maximizing a joint function of the model parameters and of the distribution over unobserved variables. From this perspective, it is easy to justify an incremental variant of the algorithm in which the distribution for only one of the unobserved variables is recalculated in each E step. This variant is shown empirically to give faster convergence in a mixture estimation problem. A wide range of other variant algorithms are also seen to be possible.
The PostScript for this paper may be retrieved in the usual fashion: unix> ftp archive.cis.ohio-state.edu (log in as user 'anonymous', e-mail address as password) ftp> binary ftp> cd pub/neuroprose ftp> get neal.em.ps.Z ftp> quit unix> uncompress neal.em.ps.Z unix> lpr neal.em.ps (or however you print PostScript files)
Many thanks to Jordan Pollack for providing this service!
Radford Neal
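A minimal sketch of the incremental variant described in this abstract, for a toy one-dimensional mixture of two Gaussians: the E step recomputes responsibilities for one data point at a time, the running sufficient statistics are patched accordingly, and the M step is applied immediately. The data, initial parameter values and number of sweeps are assumptions invented for the illustration and are not taken from the paper.

import numpy as np

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 1.0, 100), rng.normal(3.0, 1.0, 100)])  # toy data
n, K = len(x), 2

mix = np.full(K, 1.0 / K)          # mixing proportions
mu  = np.array([-1.0, 1.0])        # component means (rough initial guesses)
var = np.ones(K)                   # component variances
R   = np.full((n, K), 1.0 / K)     # current responsibilities, one row per point

def responsibilities(xi):
    p = mix * np.exp(-0.5 * (xi - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)
    return p / p.sum()

# Sufficient statistics implied by the current responsibility matrix R
N, S1, S2 = R.sum(axis=0), R.T @ x, R.T @ (x ** 2)

for sweep in range(5):
    for i in range(n):
        r_new = responsibilities(x[i])
        # Incremental E step: swap point i's old contribution for its new one
        N  += r_new - R[i]
        S1 += (r_new - R[i]) * x[i]
        S2 += (r_new - R[i]) * x[i] ** 2
        R[i] = r_new
        # M step, applied immediately, from the running statistics
        mix, mu, var = N / n, S1 / N, S2 / N - (S1 / N) ** 2

print("mixing proportions:", np.round(mix, 2))
print("means:", np.round(mu, 2))
print("variances:", np.round(var, 2))

Because the statistics are corrected point by point, the parameters can move after every data item instead of once per pass over the data, which is the intuition behind the faster convergence mentioned in the abstract.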
Radford Neal  From gem at cogsci.indiana.edu Thu Feb 18 09:03:23 1993 From: gem at cogsci.indiana.edu (Gary McGraw) Date: Thu, 18 Feb 93 09:03:23 EST Subject: Letter Spirit technical report available Message-ID: The following technical report from the Center for Research on Concepts and Cognition is available by ftp (only). Although the project described in the paper is not connectionism per se, it shares many of the same philosophical convictions. ---------------------------------------------------------------------- Letter Spirit: An Emergent Model of the Perception and Creation of Alphabetic Style Douglas Hofstadter & Gary McGraw The Letter Spirit project explores the creative act of artistic letter-design. The aim is to model how the $26$ lowercase letters of the roman alphabet can be rendered in many different but internally coherent styles. Viewed from a distance, the behavior of the program can be seen to result from the interaction of four emergent agents working together to form a coherent style and to design a complete alphabet: the Imaginer (which plays with the concepts behind letterforms), the Drafter (which converts ideas for letterforms into graphical realizations), the Examiner (which combines bottom-up and top-down processing to perceive and categorize letterforms), and the Adjudicator (which perceives and dynamically builds a representation of the evolving style). Creating a gridfont is an iterative process of guesswork and evaluation carried out by the four agents. This process is the ``central feedback loop of creativity''. Implementation of Letter Spirit is just beginning. This paper outlines our goals and plans for the project. --------------------------------------------------------------------------- The paper is available by anonymous ftp from: cogsci.indiana.edu (129.79.238.12) as pub/hofstadter+mcgraw.letter-spirit.ps.Z and in neuroprose: archive.cis.ohio-state.edu (128.146.8.52) as pub/neuroprose/hofstadter.letter-spirit.ps.Z Unfortunately, we are not able to distribute hardcopy at this time. *---------------------------------------------------------------------------* | Gary McGraw gem at cogsci.indiana.edu | (__) | |--------------------------------------------------| (oo) | | Center for Research on Concepts and Cognition | /-------\/ | | Department of Computer Science | / | || | | Indiana University | * ||----|| | | mcgrawg at moose.indiana.edu | ^^ ^^ | *---------------------------------------------------------------------------*  From mwitten at hermes.chpc.utexas.edu Thu Feb 18 10:00:41 1993 From: mwitten at hermes.chpc.utexas.edu (mwitten@hermes.chpc.utexas.edu) Date: Thu, 18 Feb 93 9:00:41 CST Subject: Computational Neurosciences Workshop Message-ID: <9302181500.AA03619@morpheus.chpc.utexas.edu> *********************************************************************** ** ** ** UNIVERSITY OF TEXAS SYSTEM CENTER FOR HIGH PERFORMANCE COMPUTING ** ** ** ** Workshop Series In Computational Medicine And Public Health** ** ** ** Announces ** ** ** ** A Workshop On Computational Neurosciences ** ** ** ** 14-15 May 1993 ** ** ** ** Austin, Texas ** ** ** *********************************************************************** Workshop Director: ----------------- Dr. 
Matthew Witten Associate Director, University of Texas System - CHPC Balcones Research Center 10100 Burnet Road, CMS 1.154 Austin, TX 78758-4497 USA Phone: (512) 471-2472 or (800) 262-2472 Fax : (512) 471-2445 email: m.witten at chpc.utexas.edu m.witten at uthermes.bitnet ***** Preliminary Program ***** List Of Current Speakers: ------------------------- Dr. Peter Fox, Director, Research Imaging Center, UT HSC San Antonio Dr. Terry Mikiten, Associate Dean, Grad School of Biomedical Sciences, UT HSC San Antonio Dr. Robert Wyatt, Director, Institute For Theoretical Chemistry, UT Austin Dr. Elizabeth Thomas, Department of Chemistry, UT Austin Dr. George Adomian, Director, General Analytics Corporation, Athens, Georgia Dr. George Moore, Department of Biomedical Engineering, University of Southern California, Los Angeles, CA Dr. William Softky, California Institute of Technology, Pasadena, CA Dr. Cathy Wu, Department of Biomathematics and Computer Science, UT Health Center, Tyler, TX Dr. Dan Levine, Department of Mathematics, University of Texas at Arlington, Arlington, TX Dr. Michael Liebman, Senior Scientist, Amoco Technology Company, Naperville, Illinois Dr. George Stanford, Learning Abilities Center, UT Austin Dr. Tom Oakland, School of Education, UT Austin Dr. Matthew Witten, Associate Director, UT System - CHPC Objective, Agenda and Participants: ---------------------------------- The 1990's have been declared the Decade of the Mind. Understanding the mind requires the understanding of a wide variety of topics in the neurosciences. This Workshop is part of an ongoing series of workshops being held at the UT System Center For High Performance Computing, addressing issues of high performance computing and its role in medicine, dentistry, allied health disciplines, and public health. Prior workshops have covered Computational Chemistry and Molecular Design, and Computational Issues in the Life Sciences and Medicine. Upcoming workshops will focus on the subject areas of Computational Molecular Biology and Genetics, Biomechanics, and Physiological Modeling and Simulation. The purpose of this Workshop On Computational Neurosciences is to bring together interested scientists for the purpose of introducing them to state-of-the-art thinking and applications in the domain of neuroscience. Topics to be discussed range across the disciplines of neurosimulation, cognitive neuroscience, neural nets and their theory/application to a variety of problems, methods for solving numerical problems arising in neurology, learning abilities and disabilities, and neurological imaging. Lectures will be presented in a tutorial fashion, and time for questions and answers will be allowed. Attendance is open to anyone. A background in the neurosciences is not required. The size of the workshop is limited due to seating constraints. It is best to register as soon as possible. Schedule: -------- 14 May 1993 - Friday 8:00am - 9:00am Registration and Refreshments 9:00am - 9:15am Opening Remarks - Dr. James C. Almond, Director, UT System CHPC 9:15am - 10:00am Conference Overview - Dr. Matthew Witten 10:00am - 11:00am Dr. Peter Fox 11:00am - 11:30am Coffee Break 11:30am - 12:30pm Dr. Dan Levine 12:30pm - 1:30pm Lunch Break 1:30pm - 2:30pm Dr. Michael Liebman 2:30pm - 3:30pm Dr. Cathy Wu 3:30pm - 4:00pm Coffee Break 4:00pm - 5:00pm Dr. Terry Mikiten 15 May 1993 - Saturday 8:00am - 9:00am Registration and Refreshments 9:00am - 10:00am Dr. George Moore 10:00am - 11:00am Dr. Robert Wyatt and Dr.
Elizabeth Thomas 11:00am - 11:30am Coffee Break 11:30am - 12:30pm Dr. George Adomian 12:30pm - 1:30pm Lunch Break 1:30pm - 2:30pm Dr. George Stanford and Dr. Tom Oakland 2:30pm - 3:30pm Dr. William Softky 3:30pm - 4:00pm Coffee Break 4:00pm - 5:00pm Closing Discussion and Remarks Poster Sessions: ---------------- While no poster sessions are planned, if enough conference participants indicate a desire to present a poster, we will make every attempt to accommodate the requests. If you are interested in presenting a poster at this meeting, please contact the workshop director. Conference Proceedings: ---------------------- We will make every attempt to have a publication-quality conference proceedings. All of the speakers have been asked to submit a paper covering the talk material. The proceedings will appear as a special issue of the series Advances In Mathematics And Computers In Medicine, which is part of the International Journal of Computers and Mathematics With Applications (Pergamon Press). Individuals wishing to have an appropriate paper included in this proceedings should contact the workshop director for manuscript details and deadlines. Conference Costs and Funding: ----------------------------- A nominal registration fee of US $50.00 will be charged by 1 April 93, and US $60.00 after that date. The conference proceedings will be an additional US $10.00. The conference registration fee includes luncheon and refreshments for both days of the workshop. Accommodations: ------------- There are a number of very reasonable hotels near the UT System CHPC. Additional information may be obtained by contacting the workshop coordinator at the address below. Registration and Information: ---------------------------- Registration requests and further questions should be directed to: Ms. Leslie Bockoven Administrative Associate Workshop On Computational NeuroSciences UT System - CHPC Balcones Research Center 10100 Burnet Road, CMS 1.154 Austin, TX 78758-4497 Phone: (512) 471-2472 or (800) 262-2472 Fax : (512) 471-2445 Email: neuro93 at chpc.utexas.edu neuro93 at uthermes.bitnet ============ REGISTRATION FORM FOLLOWS - CUT HERE ========== NAME (As will appear on badge): AFFILIATION (As will appear on badge): ADDRESS: PHONE: FAX : EMAIL: Please answer the following questions as appropriate: Do you wish to purchase a copy of the conference proceedings? If yes, make sure to include the proceedings purchase fee. Do you have any special dietary requirements? If yes, what are they? Do you wish to present a poster? If yes, what will the proposed title be? Do you wish to include a manuscript in the conference proceedings? If yes, what will the proposed topic be? Do you wish to be on our Workshop Series mailing list? If yes, please give the address for announcements (email is okay) Do you need a hotel reservation? Do you anticipate needing local transportation? ==================== END OF REGISTRATION FORM ============================  From gary at psyche.mit.edu Wed Feb 17 18:42:21 1993 From: gary at psyche.mit.edu (Gary Marcus) Date: Wed, 17 Feb 93 18:42:21 EST Subject: MIT Center for Cognitive Science Occasional Paper #47 Message-ID: <9302172342.AA04329@psyche.mit.edu> Would you please post the following announcement? Thank you very much. Sincerely, Gary Marcus ---- The following technical report is now available: MIT CENTER FOR COGNITIVE SCIENCE OCCASIONAL PAPER #47 German Inflection: The Exception that Proves the Rule Gary F.
Marcus MIT Ursula Brinkmann Max-Planck-Institut fuer Psycholinguistik Harald Clahsen Richard Wiese Andreas Woest Universitaet Duesseldorf Steven Pinker MIT ABSTRACT Language is often explained by generative rules and a memorized lexicon. For example, most English verbs take a regular past tense suffix (ask-asked), which is applied to new verbs (faxed, wugged), suggesting the mental rule "add -d to a Verb." Irregular verbs (break-broke, go-went) would be listed in memory. Connectionists argue instead that a pattern associator memory can store and generalize all past tense forms; irregular and regular patterns differ only because of their different numbers of verbs. We present evidence that mental rules are indispensable. A rule concatenates a suffix to a symbol for verbs, so it does not require access to memorized verbs or their sounds, but applies as the "default" whenever memory access fails. We find 20 such circumstances, including novel, unusual-sounding, and derived words; in every case, people inflect them regularly (explaining quirks like flied out, sabre-tooths, walkmans). Contrary to connectionist accounts, these effects are not due to regular words being in the majority. The German participle -t and plural -s apply to minorities of words. Two experiments eliciting ratings of novel German words show that the affixes behave like their English counterparts, as defaults. Thus default suffixation is not due to numerous regular words reinforcing a pattern in associative memory, but to a memory-independent, symbol-concatenating mental operation. --------------------------------------------------------------------------- Copies of the postscript file german.ps.Z may be obtained electronically from psyche.mit.edu as follows: unix-1> ftp psyche.mit.edu (or ftp 18.88.0.85) Connected to psyche.mit.edu. Name (psyche:): anonymous 331 Guest login ok, send ident as password. Password: yourname 230 Guest login ok, access restrictions apply. ftp> cd pub 250 CWD command successful. ftp> binary 200 Type set to I. ftp> get german.ps.Z 200 PORT command successful. 150 Opening data connection for german.ps.Z (18.88.0.154,1500) (253471 bytes). 226 Transfer complete. local: german.ps.Z remote: german.ps.Z 166433 bytes received in 4.2 seconds (39 Kbytes/s) ftp> quit unix-2> uncompress german.ps.Z unix-3> lpr -P(your_local_postscript_printer) german.ps Or, order a hardcopy by sending your physical mail address to Eleanor Bonsaint (bonsaint at psyche.mit.edu), asking for Occasional Paper #47. Please do this only if you cannot use the ftp method described above.  From josh at faline.bellcore.com Thu Feb 18 10:59:52 1993 From: josh at faline.bellcore.com (Joshua Alspector) Date: Thu, 18 Feb 93 10:59:52 EST Subject: Workshop on applications of neural networks to telecommunications Message-ID: <9302181559.AA02043@faline.bellcore.com> CALL FOR PAPERS International Workshop on Applications of Neural Networks to Telecommunications Princeton, NJ October 18-20, 1993 You are invited to submit a paper to an international workshop on applications of neural networks to problems in telecommunications. The workshop will be held in Princeton, New Jersey on October 18-20, 1993. This workshop will bring together active researchers in neural networks with potential users in the telecommunications industry in a forum for discussion of applications issues. Applications will be identified, experiences shared, and directions for future work explored.
Suggested Topics: Application of Neural Networks in: Network Management Congestion Control Adaptive Equalization Speech Recognition Security Verification Language ID/Translation Information Filtering Dynamic Routing Software Reliability Fraud Detection Financial and Market Prediction Adaptive User Interfaces Fault Identification and Prediction Character Recognition Adaptive Control Data Compression Please submit 6 copies of both a 50 word abstract and a 1000 word summary of your paper by May 14, 1993. Mail papers to the conference administrator: Betty Greer Bellcore, MRE 2P-295 445 South St. Morristown, NJ 07960 (201) 829-4993 (fax) 829-5888 bg1 at faline.bellcore.com Abstract and Summary Due: May 14 Author Notification of Acceptance: June 18 Camera-Ready Copy of Paper Due: August 13 Organizing Committee: General Chair Josh Alspector Bellcore, MRE 2P-396 445 South St. Morristown, NJ 07960-6438 (201) 829-4342 josh at bellcore.com Program Chair Rod Goodman Caltech 116-81 Pasadena, CA 91125 (818) 356-3677 rogo at micro.caltech.edu Publications Chair Timothy X Brown Bellcore, MRE 2E-378 445 South St. Morristown, NJ 07960-6438 (201) 829-4314 timxb at faline.bellcore.com Treasurer Anthony Jayakumar, Bellcore Events Coordinator Larry Jackel, AT&T Bell Laboratories University Liaison S Y Kung, Princeton INNS Liaison Bernie Widrow, Stanford University IEEE Liaison Steve Weinstein, Bellcore Industry Liaisons Miklos Boda, Ellemtel Atul Chhabra, NYNEX Michael Gell, British Telecom Lee Giles, NEC Thomas John, Southwest Bell Adam Kowalczyk, Telecom Australia Conference Administrator Betty Greer Bellcore, MRE 2P-295 445 South St. Morristown, NJ 07960 (201) 829-4993 (fax) 829-5888 bg1 at faline.bellcore.com ----------------------------------------------------------------------------- ----------------------------------------------------------------------------- International Workshop on Applications of Neural Networks to Telecommunications Princeton, NJ October 18-20, 1993 Registration Form Name: _____________________________________________________________ Institution: __________________________________________________________ Mailing Address: ___________________________________________________________________ ___________________________________________________________________ ___________________________________________________________________ ___________________________________________________________________ Telephone: ______________________________ Fax: ____________________________________ E-mail: _____________________________________________________________ I will attend | | Send more information | | Paper enclosed | | Registration Fee Enclosed ($350) | | (please make sure your name is on the check) Registration includes Monday night reception, Tuesday night banquet, and proceedings available at the conference. Mail to: Betty Greer Bellcore, MRE 2P-295 445 South St. Morristown, NJ 07960 (201) 829-4993 (fax) 829-5888 bg1 at faline.bellcore.com Deadline for submissions: May 14, 1993 Author Notification of Acceptance: June 18, 1993 Camera-Ready Copy of Paper Due: August 13, 1993  From miller at picard.ads.com Thu Feb 18 11:51:18 1993 From: miller at picard.ads.com (Kenyon Miller) Date: Thu, 18 Feb 93 11:51:18 EST Subject: correction to backprop example Message-ID: <9302181651.AA02454@picard.ads.com> For those of you who have lost interest in the backprop debate about replacing the sigmoid derivative with a constant, please disregard this message. 
It was recently pointed out to me that my backprop example was incomplete (I don't know the name of the sender):

> The error need not be increased although w increased because W1-3 decreased
> and W3-4 decreased. With 2 decreases and 1 increase, one could still expect
> the N4 to decrease and also the error.
> Rgds,
> TH

My original example (with typographical corrections) was: Consider the following example:

         ----n2-----
        /           \
  w--n1              n4
        \           /
         ----n3-----

In other words, there is an output neuron n4 which is connected to two neurons n2 and n3, each of which is connected to neuron n1, which has a weight w. Suppose the weight connecting n2 to n4 is negative and all other connections in the diagram are positive. Suppose further that n2 is saturated and none of the other neurons are saturated. Now, suppose that n4 must be decreased in order to reduce the error. Backpropagating along the n4-n2-n1 path, w receives an error term which would tend to increase n1, while backpropagating along the n4-n3-n1 path would result in a term which would tend to decrease n1. If the true sigmoid derivative were used, the force to increase n1 would be dampened because n2 is saturated, and the net result would be to decrease w and therefore decrease n1, n3, n4, and the error. However, replacing the sigmoid derivative with a constant could easily allow the n4-n2-n1 path to dominate, and the error at the output would increase. The conclusion was that replacing the sigmoid derivative with a constant can result in increasing the error, and is therefore undesirable. CORRECTION TO THE EXAMPLE: The original example did not take into account the perturbation on W1-3 and W3-4, but the argument still holds with the following modification. Whatever the perturbation on W1-3 and W3-4, there exists (or at least a situation can be constructed such that there exists) some positive perturbation on w which will counteract those perturbations and result in an increase in the output error. Now replicate the n1-n2-n4 path as necessary by adding an n1-n5-n4 path, an n1-n6-n4 path etc. Each new path results in incrementing w by some constant delta, so there must exist some number of paths which results in a sufficient increase in w to cause an increase in the output error of the network. Thus, an example can be constructed in which the error increases, so the method cannot be considered theoretically sound. However, you can get virtually all of the benefit without any of the theoretical problems by using the derivative of the piecewise-linear function

                -------------------
               /
              /
             /
    ---------

which involves using a constant or zero for the derivative, depending on a simple range test. -Ken Miller.  From georgiou at silicon.csci.csusb.edu Thu Feb 18 13:04:55 1993 From: georgiou at silicon.csci.csusb.edu (George M. Georgiou) Date: Thu, 18 Feb 1993 10:04:55 -0800 Subject: Multivalued and Continuous Perceptrons (Preprint) Message-ID: <9302181804.AA24680@silicon.csci.csusb.edu> Rosenblatt's Perceptron Theorem guarantees us that a linearly separable function (R^n --> {0,1}) can be learned in finite time. Question: Is it possible to guarantee learning of a continuous-valued function (R^n --> (0,1)) which can be represented on a perceptron in finite time? This paper answers this question (and other ones too) in the affirmative: The Multivalued and Continuous Perceptrons by George M. Georgiou Rosenblatt's perceptron is extended to (1) a multivalued perceptron and (2) to a continuous-valued perceptron.
It is shown that any function that can be represented by the multivalued perceptron can be learned in a finite number of steps, and any function that can be represented by the continuous perceptron can be learned with arbitrary accuracy in a finite number of steps. The whole apparatus is defined in the complex domain. With these perceptrons, learnability is extended to more complicated functions than the usual linearly separable ones. The complex domain promises to be a fertile ground for neural networks research. The file in the neuroprose archive is georgiou.perceptrons.ps.Z. Comments and questions on the proofs are welcome. --------------------------------------------------------------------- Sample session to get the file: unix> ftp archive.cis.ohio-state.edu (log in as user 'anonymous', e-mail address as password) ftp> binary ftp> cd pub/neuroprose ftp> get georgiou.perceptrons.ps.Z ftp> quit unix> uncompress georgiou.perceptrons.ps.Z unix> lpr georgiou.perceptrons.ps (or however you print PostScript files) Thanks to Jordan Pollack for providing this service! --George ---------------------------------------------------- Dr. George M. Georgiou E-mail: georgiou at wiley.csusb.edu Computer Science Department TEL: (909) 880-5332 California State University FAX: (909) 880-7004 5500 University Pkwy San Bernardino, CA 92407, USA  From rangarajan-anand at CS.YALE.EDU Thu Feb 18 13:18:40 1993 From: rangarajan-anand at CS.YALE.EDU (Anand Rangarajan) Date: Thu, 18 Feb 1993 13:18:40 -0500 Subject: No subject Message-ID: <199302181818.AA24890@COMPOSITION.SYSTEMSZ.CS.YALE.EDU> Programmer/Analyst Position in Artificial Neural Networks The Yale Center for Theoretical and Applied Neuroscience (CTAN) and the Department of Computer Science Yale University, New Haven, CT We are offering a challenging position in software engineering in support of new techniques in image processing and computer vision using artificial neural networks (ANNs). 1. Basic Function: Designer and programmer for computer vision and neural network software at CTAN and the Computer Science department. 2. Major duties: (a) To implement computer vision algorithms using a Khoros (or similar) type of environment. (b) Use the aforementioned tools and environment to run and analyze computer experiments in specific image processing and vision application areas. (c) To facilitate the improvement of neural network algorithms and architectures for vision and image processing. 3. Position Specifications: (a) Education: BA, including linear algebra, differential equations, calculus. Helpful: mathematical optimization. (b) Experience: programming experience in C++ (or C) under UNIX. Some of the following: neural networks, vision or image processing applications, scientific computing, workstation graphics, image processing environments, parallel computing, computer algebra and object-oriented design. Preferred starting date: March 1, 1993. For information or to submit an application, please write: Eric Mjolsness Department of Computer Science Yale University P. O. Box 2158 Yale Station New Haven, CT 06520-2158 e-mail: mjolsness-eric at cs.yale.edu Any application must also be submitted to: Jeffrey Drexler Department of Human Resources Yale University 155 Whitney Ave.
New Haven, CT 06520 -Eric Mjolsness and Anand Rangarajan (prospective supervisors)  From pjs at bvd.Jpl.Nasa.Gov Thu Feb 18 14:49:36 1993 From: pjs at bvd.Jpl.Nasa.Gov (Padhraic Smyth) Date: Thu, 18 Feb 93 11:49:36 PST Subject: Position Available at JPL Message-ID: <9302181949.AA26236@bvd.jpl.nasa.gov> We currently have an opening in our group for a new PhD graduate in the general area of signal processing and pattern recognition. While the job description does not mention neural computation per se, it may be of interest to some members of the connectionist mailing list. For details see below. Padhraic Smyth, JPL RESEARCH POSITION AVAILABLE AT THE JET PROPULSION LABORATORY, CALIFORNIA INSTITUTE OF TECHNOLOGY The Communications Systems Research Section at JPL has an immediate opening for a permanent member of technical staff in the area of adaptive signal processing and statistical pattern recognition. The position requires a PhD in Electrical Engineering or a closely related field and applicants should have a demonstrated ability to perform independent research. A background in statistical signal processing is highly desirable. Background in information theory, estimation and detection, advanced statistical methods, and pattern recognition, would also be a plus. Current projects within the group include the use of hidden Markov models for change detection in time series, and statistical methods for geologic feature detection in remotely sensed image data. The successful applicant will be expected to perform both basic and applied research and to propose and initiate new research projects. Permanent residency or U.S. citizenship is not a strict requirement - however, candidates not in either of these categories should be aware that their applications will only be considered in exceptional cases. Interested applicants should send their resume (plus any supporting background material such as recent relevant papers) to: Dr. Stephen Townes JPL 238-420 4800 Oak Grove Drive Pasadena, CA 91109. (email: townes at bvd.jpl.nasa.gov)  From mpp at cns.brown.edu Thu Feb 18 15:42:34 1993 From: mpp at cns.brown.edu (Michael P. Perrone) Date: Thu, 18 Feb 93 15:42:34 EST Subject: A computationally efficient squashing function Message-ID: <9302182042.AA03424@cns.brown.edu> Recently on the comp.ai.neural-nets bboard, there has been a discussion of more computationally efficient squashing functions. Some colleagues of mine suggested that many members of the Connectionist mailing list may not have access to the comp.ai.neural-nets bboard; so I have included a summary below. Michael ------------------------------------------------------ David L. Elliott mentioned using the following neuron activation function: f(x) = x / (1 + |x|) He argues that this function has the same qualitative properties as the hyperbolic tangent function but is in practice faster to calculate. I have suggested a similar speed-up for radial basis function networks: f(x) = 1 / (1 + x^2) which avoids the transcendental calculation associated with Gaussian RBF nets. I have run simulations using the above squashing function in various backprop networks. The performance is comparable (sometimes worse, sometimes better) to usual training using hyperbolic tangents. I also found that the performance of networks varied very little when the activation functions were switched (i.e. two networks with identical weights but different activation functions will have comparable performance on the same data).
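For readers who want to see the shapes side by side, here is a minimal C sketch that tabulates the two rational functions above against their transcendental counterparts; the function names and sample points are illustrative choices, not taken from the posting.

/* Compare the rational squashing functions with tanh and a Gaussian.
   Names and sample points are illustrative choices only. */
#include <stdio.h>
#include <math.h>

/* s(x) = x / (1 + |x|): ranges over (-1, 1) like tanh, but needs no
   exponential; its derivative is 1 / (1 + |x|)^2. */
double fast_squash(double x) { return x / (1.0 + fabs(x)); }

/* g(x) = 1 / (1 + x^2): a rational stand-in for a Gaussian RBF response. */
double fast_rbf(double x) { return 1.0 / (1.0 + x * x); }

int main(void)
{
    double x;
    printf("    x   tanh(x)  x/(1+|x|)  exp(-x^2)  1/(1+x^2)\n");
    for (x = -3.0; x <= 3.0; x += 1.0)
        printf("%5.1f  %7.3f  %9.3f  %9.3f  %9.3f\n",
               x, tanh(x), fast_squash(x), exp(-x * x), fast_rbf(x));
    return 0;
}

Compiled with the math library (for example, cc compare.c -lm), this prints a small table showing that both rational functions approach their asymptotes more slowly than the corresponding transcendental ones.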
I tested these results on two databases: the NIST OCR database (preprocessed by Nestor Inc.) and the Turk and Pentland human face database. -------------------------------------------------------------------------------- Michael P. Perrone Email: mpp at cns.brown.edu Institute for Brain and Neural Systems Tel: 401-863-3920 Brown University Fax: 401-863-3934 Providence, RI 02912  From henrik at robots.ox.ac.uk Fri Feb 19 11:47:16 1993 From: henrik at robots.ox.ac.uk (henrik@robots.ox.ac.uk) Date: Fri, 19 Feb 93 16:47:16 GMT Subject: Squashing functions Message-ID: <9302191647.AA05729@cato.robots.ox.ac.uk> Any interesting squashing function can be stored in a table of negligible size (eg 256) with very high accuracy if linear (or higher) interpolation is used. So, on a RISC workstation, there is no need for improvements. If you deal with analog VLSI, anything goes, though ... Cheers, henrik at robots.ox.ac.uk  From cateau at tkyux.phys.s.u-tokyo.ac.jp Sat Feb 20 01:11:11 1993 From: cateau at tkyux.phys.s.u-tokyo.ac.jp (Hideyuki Cateau) Date: Sat, 20 Feb 93 15:11:11 +0900 Subject: TR: Universal Power law Message-ID: <9302200611.AA21000@tkyux.phys.s.u-tokyo.ac.jp> My collaborators and I previously reported that there is a beautiful power law in the pace of memorization in back-propagation networks. One of the networkers reacted that the law had been established only for that special model. This time we have performed an extensive simulation, reported in the technical report cateau.univ.tar.Z, to show that the law is fairly universal: Universal Power law in feed forward networks H.Cateau Department of Physics University of Tokyo Abstract: The power law in the pace of the memory, which was previously reported for the encoder, is shown to hold universally for general feed forward networks. An extensive simulation on a wide variety of feed forward networks shows this and reveals a number of interesting new observations. The PostScript for this paper may be retrieved in the usual fashion: unix> ftp archive.cis.ohio-state.edu (log in as user 'anonymous', e-mail address as password) ftp> binary ftp> cd pub/neuroprose ftp> get cateau.univ.tar.Z ftp> quit unix> uncompress cateau.univ.tar.Z unix> tar xvfo cateau.univ.tar Then you get three PS files: short.ps fig1.ps fig2.ps unix> lpr short.ps unix> lpr fig1.ps unix> lpr fig2.ps Hideyuki Cateau Particle theory group, Department of Physics, University of Tokyo, 7-3-1, Hongo, Bunkyoku, 113 Japan e-mail: cateau at tkyux.phys.s.u-tokyo.ac.jp  From soller at asylum.cs.utah.edu Fri Feb 19 16:09:43 1993 From: soller at asylum.cs.utah.edu (Jerome Soller) Date: Fri, 19 Feb 93 14:09:43 -0700 Subject: Industrial Position in Artificial Intelligence and/or Neural Networks Message-ID: <9302192109.AA22408@asylum.cs.utah.edu> I have just been made aware of a job opening in artificial intelligence and/or neural networks in southeast Ogden, UT. This company maintains strong technical interaction with existing industrial, U.S. government laboratory, and university strengths in Utah. Ogden is a half hour to 45 minute drive from Salt Lake City, UT. For further information, contact Dale Sanders at 801-625-8343 or dsanders at bmd.trw.com. The full job description is listed below. Sincerely, Jerome Soller U. of Utah Department of Computer Science and VA Geriatric, Research, Education and Clinical Center Knowledge engineering and expert systems development. Requires five years formal software development experience, including two years expert systems development.
Requires experience implementing at least one working expert system. Requires familiarity with expert systems development tools and DoD specification practices. Experience with neural nets or fuzzy logic systems may qualify as equivalent experience to expert systems development. Familiarity with Ada, C/C++, database design, and probabilistic risk assessment strongly desired. Requires strong communication and customer interface skills. Minimum degree: BS in computer science, engineering, math, or physical science. M.S. or Ph.D. preferred. U.S. Citizenship is required. Relocation funding is limited.  From delliott at eng.umd.edu Fri Feb 19 15:22:38 1993 From: delliott at eng.umd.edu (David L. Elliott) Date: Fri, 19 Feb 1993 15:22:38 -0500 Subject: Abstract Message-ID: <199302192022.AA03327@verdi.eng.umd.edu> ABSTRACT A BETTER ACTIVATION FUNCTION FOR ARTIFICIAL NEURAL NETWORKS TR 93-8, Institute for Systems Research, University of Maryland by David L. Elliott-- ISR, NeuroDyne, Inc., and Washington University January 29, 1993 The activation function s(x) = x/(1 + |x|) is proposed for use in digital simulation of neural networks, on the grounds that the computational operation count for this function is much smaller than for those using exponentials and that it satisfies the simple differential equation s' = (1 + |s|)^2, which generalizes the logistic equation. The full report, a work-in-progress, is available in LaTeX or PostScript form (two pages + titlepage) by request to delliott at src.umd.edu.  From tony at aivru.shef.ac.uk Fri Feb 19 05:59:46 1993 From: tony at aivru.shef.ac.uk (Tony_Prescott) Date: Fri, 19 Feb 93 10:59:46 GMT Subject: lectureship Message-ID: <9302191059.AA23937@aivru> LECTURESHIP IN COGNITIVE SCIENCE University of Sheffield, UK. Applications are invited for the above post tenable from 1st October 1993 for three years in the first instance but with expectation of renewal. Preference will be given to candidates with a PhD in Cognitive Science, Artificial Intelligence, Cognitive Psychology, Computer Science, Robotics, or related disciplines. The Cognitive Science degree is an integrated course taught by the departments of Psychology and Computer Science. Research in Cognitive Science was highly evaluated in the recent UFC research evaluation exercise, special areas of interest being vision, speech, language, neural networks, and learning. The successful candidate will be expected to undertake research vigorously. Supervision of programming projects will be required, hence considerable experience with Lisp, Prolog, and/or C is essential. It is expected that the appointment will be made on the Lecturer A scale (13,400-18,576 pounds(uk) p.a.) according to age and experience but enquiries from more experienced staff able to bring research resources are welcomed. Informal enquiries to Professor John P Frisby 044-(0)742-826538 or e-mail jpf at aivru.sheffield.ac.uk. Further particulars from the director of Personnel Services, The University, Sheffield S10 2TN, UK, to whom all applications including a cv and the names and addresses of three referees (6 copies of all documents) should be sent by 1 April 1993. Short-listed candidates will be invited to Sheffield for interview for which travel expenses (within the UK only) will be funded. 
Current permanent research staff in Cognitive Science at Sheffield include: Prof John Frisby (visual psychophysics), Prof John Mayhew (computer vision, robotics, neural networks) Prof Yorik Wilks (natural language understanding) Dr Phil Green (speech recognition) Dr John Porrill (computer vision) Dr Paul McKevitt (natural language understanding) Dr Peter Scott (computer assisted learning) Dr Rod Nicolson (human learning) Dr Paul Dean (neuroscience, neural networks) Mr Tony Prescott (neural networks, comparative cog sci)  From delliott at src.umd.edu Sat Feb 20 15:23:57 1993 From: delliott at src.umd.edu (David L. Elliott) Date: Sat, 20 Feb 1993 15:23:57 -0500 Subject: Corrected Abstract Message-ID: <199302202023.AA12407@newra.src.umd.edu> ABSTRACT [corrected] A BETTER ACTIVATION FUNCTION FOR ARTIFICIAL NEURAL NETWORKS TR 93-8, Institute for Systems Research, University of Maryland by David L. Elliott-- ISR, NeuroDyne, Inc., and Washington University January 29, 1993 The activation function s(x) = x/(1 + |x|) is proposed for use in digital simulation of neural networks, on the grounds that the computational operation count for this function is much smaller than for those using exponentials and that it satisfies the simple differential equation s' = (1 - |s|)^2, which generalizes the logistic equation. The full report, a work-in-progress, is available in LaTeX or PostScript form (two pages + titlepage) by request to delliott at src.umd.edu. Thanks to Michael Perrone for calling my attention to the typo in s'.  From raina at max.ee.lsu.edu Sat Feb 20 17:37:45 1993 From: raina at max.ee.lsu.edu (Praveen Raina) Date: Sat, 20 Feb 93 16:37:45 CST Subject: No subject Message-ID: <9302202237.AA13139@max.ee.lsu.edu> The following comparison between the backpropagation and Kak algorithm for training feedforward networks will be of interest to many. We took 52 training samples each having 25 input neurons and 3 output neurons.The training data taken was monthly price index of a commodity for 60 months. Monthly prices were normalised and quantized into 3 bit binary sequence. Each training sample represented prices taken over a period of 8 months (8X3=24 input neurons + 1 neuron for bias).The size of the learning window was fixed as 1 month.Binary values were used as the input for both BP and Kak algorithm. For BP the learning rate was taken as 0.45 and momentum equal to 0.55. The training samples were trained on IBM RISC 6000 machine. The training time for backpropagation was 4 minutes 5 seconds and the total number of iterations was 6101.The training time for the Kak algorithm was 5 seconds and the total number of iterations was 875. Thus, for this example the learning advantage in the Kak algorithm is 49. For larger examples the advantage becomes even greater. - Praveen Raina.  From unni at neuro.cs.gmr.com Sat Feb 20 14:57:13 1993 From: unni at neuro.cs.gmr.com (K.P.Unnikrishnan) Date: Sat, 20 Feb 93 14:57:13 EST Subject: A NEURAL COMPUTATION course reading list Message-ID: <9302201957.AA22392@neuro.cs.gmr.com> Folks: Here is the reading list for a course I offered last semester at Univ. of Michigan. Unnikrishnan --------------------------------------------------------------- READING LIST FOR THE COURSE "NEURAL COMPUTATION" EECS-598-6 (FALL 1992), UNIVERSITY OF MICHIGAN INSTRUCTOR: K. P. UNNIKRISHNAN ----------------------------------------------- A. COMPUTATION AND CODING IN THE NERVOUS SYSTEM 1. Hodgkin, A.L., and Huxley, A.F. 
A quantitative description of membrane current and its application to conduction and excitation in nerve. J. Physiol. 117, 500-544 (1952). 2a. Del Castillo, J., and Katz, B. Quantal components of the end-plate potential. J. Physiol. 124, 560-573 (1954). 2b. Del Castillo, J., and Katz, B. Statistical factors involved in neuromuscular facilitation and depression. J. Physiol. 124, 574-585 (1954). 3. Rall, W. Cable theory for dendritic neurons. In: Methods in neural modeling (Koch and Segev, eds.) pp. 9-62 (1989). 4. Koch, C., and Poggio, T. Biophysics of computation: neurons, synapses and membranes. In: Synaptic function (Edelman, Gall, and Cowan, eds.) pp. 637-698 (1987). B. SENSORY PROCESSING IN VISUAL AND AUDITORY SYSTEMS 1. Werblin, F.S., and Dowling, J.E. Organization of the retina of the mudpuppy, Necturus maculosus: II. Intracellular recording. J. Neurophysiol. 32, 339-355 (1969). 2a. Barlow H.B., and Levick, W.R. The mechanism of directionally selective units in rabbit's retina. J. Physiol. 178, 477-504 (1965). 2b. Lettvin, J.Y., Maturana, H.R., McCulloch, W.S., and Pitts, W.H. What the frog's eye tells the frogs's brain. Proc. IRE 47, 1940-1951 (1959). 3. Hubel, D.H., and Wiesel, T.N. Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. J. Physiol. 160, 106-154 (1962). 4a. Suga, N. Cortical computational maps for auditory imaging. Neural Networks, 3, 3-21 (1990). 4b. Simmons, J.A. A view of the world through the bat's ear: the formation of acoustic images in echolocation. Cognition, 33 155-199 (1989). C. MODELS OF SENSORY SYSTEMS 1. Hect,S., Shlaer, S., and Pirenne, M.H. Energy, quanta, and vision. J. Gen. Physiol. 25, 819-840 (1942). 2. Julesz, B., and Bergen, J.R. Textons, the fundamental elements in preattentive vision and perception of textures. Bell Sys. Tech. J. 62, 1619-1645 (1983). 3a. Harth, E., Unnikrishnan, K.P., and Pandya, A.S. The inversion of sensory processing by feedback pathways: a model of visual cognitive functions. science 237, 184-187 (1987). 3b. Harth, E., Pandya, A.S., and Unnikrishnan, K.P. Optimization of cortical responses by feedback modification and synthesis of sensory afferents. A model of perception and rem sleep. Concepts Neurosci. 1, 53-68 (1990). 3c. Koch, C. The action of the corticofugal pathway on sensory thalamic nuclei: A hypothesis. Neurosci. 23, 399-406 (1987). 4a. Singer, W. et al., Formation of cortical cell assemblies. In: CSH Symposia on Quant. Biol. 55, pp. 939-952 (1990). 4b. Eckhorn, R., Reitboeck, H.J., Arndt, M., and Dicke, P. Feature linking via synchronization among distributed assemblies: Simulations of results from cat visual cortex. Neural Comp. 293-307 (1990). 5. Reichardt, W., and Poggio, T. Visual control of orientation behavior in the fly. Part I. A quantitative analysis. Q. Rev. Biophys. 9, 311-375 (1976). D. ARTIFICIAL NEURAL NETWORKS 1a. Block, H.D. The perceptron: a model for brain functioning. Rev. Mod. Phy. 34, 123-135 (1962). 1b. Minsky, M.L., and Papert, S.A. Perceptrons. pp. 62-68 (1988). 2a. Hornik, K., Stinchcombe, M., and White, H. Multilayer feedforward networks are universal approximators. Neural Networks 2, 359-366 (1989). 2b. Lapedes, A., and Farber, R. How neural nets work. In: Neural Info. Proc. Sys. (Anderson, ed.) pp. 442-456 (1987). 3a. Ackley, D.H., Hinton, G.E., and Sejnowski, T.J. A learning algorithm for boltzmann machines. Cog. Sci. 9, 147-169 (1985). 3b. Hopfield, J.J. 
Learning algorithms and probability distributions in feed-forward and feed-back networks. PNAS, USA. 84, 8429-8433 (1987). 4. Tank, D.W., and Hopfield, J.J. Simple neural optimization networks: An A/D converter, signal decision circuit, and linear programming circuit. IEEE Tr. Cir. Sys. 33, 533-541 (1986). E. NEURAL NETWORK APPLICATIONS 1. LeCun, Y., et al., Backpropagation applied to handwritten zip code recognition. Neural Comp. 1, 541-551 (1990). 2. Lapedes, A., and Farber, R. Nonlinear signal processing using neural networks. LA-UR-87-2662, Los Alamos Natl. Lab. (1987). 3. Unnikrishnan, K.P., Hopfield, J.J., and Tank, D.W. Connected-digit speaker-dependent speech recognition using a neural network with time-delayed connections. IEEE Tr. ASSP. 39, 698-713 (1991). 4a. De Vries, B., and Principe, J.C. The gamma model - a new neural model for temporal processing. Neural Networks 5, 565-576 (1992). 4b. Poddar, P., and Unnikrishnan, K.P. Memory neuron networks: a prolegomenon. GMR-7493, GM Res. Labs. (1991). 5. Narendra, K.S., and Parthasarathy, K. Gradient methods for the optimization of dynamical systems containing neural networks. IEEE Tr. NN 2, 252-262 (1991). F. HARDWARE IMPLEMENTATIONS 1a. Mahowald, M.A., and Mead, C. Silicon retina. In: Analog VLSI and neural systems (Mead). pp. 257-278 (1989). 1b. Mahowald, M.A., and Douglas, R. A silicon neuron. Nature 354, 515-518 (1991). 2. Mueller, P. et al. Design and fabrication of VLSI components for a general purpose analog computer. In: Proc. IEEE workshop VLSI neural sys. (Mead, ed.) pp. xx-xx (1989). 3. Graf, H.P., Jackel, L.D., and Hubbard, W.E. VLSI implementation of a neural network model. Computer 2, 41-49 (1988). G. ISSUES ON LEARNING 1. Geman, S., Bienenstock, E., and Doursat, R. Neural networks and the bias/variance dilemma. Neural Comp. 4, 1-58 (1992). 2. Brown, T.H., Kairiss, E.W., and Keenan, C.L. Hebbian synapses: Biophysical mechanisms and algorithms. Ann. Rev. Neurosci. 13, 475-511 (1990). 3. Haussler, D. Quantifying inductive bias: AI learning algorithms and Valiant's learning framework. AI 36, 177-221 (1988). 4. Reeke, G.N. Jr., and Edelman, G.M. Real brains and artificial intelligence. Daedalus 117, 143-173 (1988). 5. White, H. Learning in artificial neural networks: a statistical perspective. Neural Comp. 1, 425-464 (1989). ---------------------------------------------------------------------- SUPPLEMENTAL READING Neher, E., and Sakmann, B. Single channel currents recorded from membrane of denervated frog muscle fibers. Nature 260, 779-781 (1976). Rall, W. Core conductor theory and cable properties of neurons. In: Handbook Physiol. (Brookhart, Mountcastle, and Kandel eds.) pp. 39-97 (1977). Shepherd, G.M., and Koch, C. Introduction to synaptic circuits. In: The synaptic organization of the brain (Shepherd, ed.) pp. 3-31 (1990). Junge, D. Synaptic transmission. In: Nerve and muscle excitation (Junge) pp. 149-178 (1981). Scott, A.C. The electrophysics of a nerve fiber. Rev. Mod. Phy. 47, 487-533 (1975). Enroth-Cugell, C., and Robson, J.G. The contrast sensitivity of retinal ganglion cells of the cat. J. Physiol. 187, 517-552 (1966). Felleman, D.J., and Van Essen, D.C. Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex, 1, 1-47 (1991). Julesz, B. Early vision and focal attention. Rev. Mod. Phy. 63, 735-772 (1991). Sejnowski, T.J., Koch, C., and Churchland, P.S. Computational neuroscience. Science 241, 1299-1302 (1988). Churchland, P.S., and Sejnowski, T.J. Perspectives on Cognitive Neuroscience.
Science 242, 741-745 (1988). McCulloch, W.S., and Pitts, W. A logical calculus of ideas immanent in nervous activity. Bull. Math. Biophy. 5, 115-133 (1943). Hopfield, J.J. Neural networks and physical systems with emergent collective computational abilities. PNAS, USA. 79, 2554-2558 (1982). Hopfield, J.J. Neurons with graded responses have collective computational properties like those of two-state neurons. PNAS, USA. 81, 3088-3092 (1984). Hinton, G.E., and Sejnowski, T.J. Optimal perceptual inference. Proc. IEEE CVPR. 448-453 (1983). Rumelhart, D.E., Hinton, G.E., and Williams, R.J. Learning representations by back-propagating errors. Nature 323, 533-536 (1986). Unnikrishnan, K.P., and Venugopal, K.P. Learning in connectionist networks using the Alopex algorithm. Proc. IEEE IJCNN. I-926 - I-931 (1992). Cowan, J.D., and Sharp, D.H. Neural nets. Quart. Rev. Biophys. 21, 365-427 (1988). Lippmann, R.P. An introduction to computing with neural nets. IEEE ASSP Mag. 4, 4-22 (1987). Sompolinsky, H. Statistical mechanics of neural networks. Phy. Today 41, 70-80 (1988). Hinton, G.E. Connectionist learning procedures. Art. Intel. 40, 185-234 (1989).  From demers at cs.ucsd.edu Sun Feb 21 13:45:24 1993 From: demers at cs.ucsd.edu (David DeMers) Date: Sun, 21 Feb 93 10:45:24 -0800 Subject: NIPS-5 papers: Nonlinear dimensionallity reduction / Inverse kinematics Message-ID: <9302211845.AA24988@beowulf> Non-Linear Dimensionality Reduction David DeMers & Garrison Cottrell ABSTRACT -------- A method for creating a non--linear encoder--decoder for multidimensional data with compact representations is presented. The commonly used technique of autoassociation is extended to allow non--linear representations, and an objective function which penalizes activations of individual hidden units is shown to result in minimum dimensional encodings with respect to allowable error in reconstruction. ============================================================ Global Regularization of Inverse Kinematics for Redundant Manipulators David DeMers & Kenneth Kreutz-Delgado ABSTRACT -------- The inverse kinematics problem for redundant manipulators is ill--posed and nonlinear. There are two fundamentally different issues which result in the need for some form of regularization; the existence of multiple solution branches (global ill--posedness) and the existence of excess degrees of freedom (local ill--posedness). For certain classes of manipulators, learning methods applied to input--output data generated from the forward function can be used to globally regularize the problem by partitioning the domain of the forward mapping into a finite set of regions over which the inverse problem is well--posed. Local regularization can be accomplished by an appropriate parameterization of the redundancy consistently over each region. As a result, the ill--posed problem can be transformed into a finite set of well--posed problems. Each can then be solved separately to construct approximate direct inverse functions. ============================================================= Preprints are available from the neuroprose archive Retrievable in the usual way: unix> ftp archive.cis.ohio-state.edu (128.146.8.52) login as "anonymous", password = ftp> cd pub/neuroprose ftp> binary ftp> get demers.nips92-nldr.ps.Z ftp> get demers.nips92-robot.ps.Z ftp> bye unix> uncompress demers.*.ps.Z unix> lpr -s demers.nips92-nldr.ps.Z unix> lpr -s demers.nips92-robot.ps.Z (or however you print *LARGE* PostScript files) These papers will appear in S.J. 
Hanson, J.E. Moody & C.L. Giles, eds, Advances in Neural Information Processing Systems 5 (Morgan Kaufmann, 1993). Dave DeMers demers at cs.ucsd.edu Computer Science & Engineering 0114 demers%cs at ucsd.bitnet UC San Diego ...!ucsd!cs!demers La Jolla, CA 92093-0114 (619) 534-0688, or -8187, FAX: (619) 534-7029  From srikanth at rex.cs.tulane.edu Sun Feb 21 14:41:45 1993 From: srikanth at rex.cs.tulane.edu (R. Srikanth) Date: Sun, 21 Feb 93 13:41:45 CST Subject: Abstract, New Squashing function... In-Reply-To: <199302192022.AA03327@verdi.eng.umd.edu>; from "David L. Elliott" at Feb 19, 93 3:22 pm Message-ID: <9302211941.AA17332@hercules.cs.tulane.edu>

>
> ABSTRACT
>
> A BETTER ACTIVATION FUNCTION FOR ARTIFICIAL NEURAL NETWORKS
>
> TR 93-8, Institute for Systems Research, University of Maryland
>
> by David L. Elliott-- ISR, NeuroDyne, Inc., and Washington University
> January 29, 1993
> The activation function s(x) = x/(1 + |x|) is proposed for use in
> digital simulation of neural networks, on the grounds that the
> computational operation count for this function is much smaller than
> for those using exponentials and that it satisfies the simple differential
> equation s' = (1 + |s|)^2, which generalizes the logistic equation.
> The full report, a work-in-progress, is available in LaTeX or PostScript
> form (two pages + titlepage) by request to delliott at src.umd.edu.
>
>

This squashing function, while not widely used, has been used by a few others. George Georgiou uses it for a complex back propagation network. Not only does the activation function enable him to model a complex BP network, but it also seems to lend itself to easier implementation. For more information on complex domain backprop, contact Dr. George Georgiou at georgiou at meridian.csci.csusb.edu -- srikanth at cs.tulane.edu Dept of Computer Science, Tulane University, New Orleans, La - 70118  From delliott at src.umd.edu Sun Feb 21 15:00:03 1993 From: delliott at src.umd.edu (David L. Elliott) Date: Sun, 21 Feb 1993 15:00:03 -0500 Subject: Response Message-ID: <199302212000.AA17583@newra.src.umd.edu> Henrik- Thanks for your comment; you wrote: "Any interesting squashing function can be stored in a table of negligible size (eg 256) with very high accuracy if linear (or higher) interpolation is used." I think you are right *if the domain of the map is compact* a priori. Otherwise the approximation must eventually become constant for large x, and this has bad consequences for backpropagation algorithms. For some other training methods, perhaps not. David  From gluck at pavlov.rutgers.edu Mon Feb 22 08:05:05 1993 From: gluck at pavlov.rutgers.edu (Mark Gluck) Date: Mon, 22 Feb 93 08:05:05 EST Subject: Neural Computation & Cognition: Opening for NN Programmer Message-ID: <9302221305.AA04474@james.rutgers.edu> POSITION AVAILABLE: NEURAL-NETWORK RESEARCH PROGRAMMER At the Center for Neuroscience at Rutgers-Newark, we have an opening for a full or part-time research programmer to assist in developing neural-network simulations. The research involves integrated experimental and theoretical analyses of the cognitive and neural bases of learning and memory. The focus of this research is on understanding the underlying neurobiological mechanisms for complex learning behaviors in both animals and humans. Substantial prior experience and understanding of neural-network theories and algorithms is required. Applicants should have a high level of programming experience (C or Pascal), and familiarity with Macintosh and/or UNIX.
Strong English-language communication and writing skills are essential. *** This position would be particularly appropriate for a graduating college senior who seeks "hands-on" research experience prior to graduate school in the cognitive, neural, or computational sciences *** Applications are being accepted now for an immediate start-date or for starting in June or September of this year. NOTE TO N. CALIF. APPLICANTS: Interviews for applicants from the San Francisco/Silicon Valley area will be conducted at Stanford in late March. The Neuroscience Center is located 20 minutes outside of New York City in northern New Jersey. For further information, please send an email or hard-copy letter describe your relevant background, experience, and career goals to: ______________________________________________________________________ Dr. Mark A. Gluck Center for Molecular & Behavioral Neuroscience Rutgers University 197 University Ave. Newark, New Jersey 07102 Phone: (201) 648-1080 (Ext. 3221) Fax: (201) 648-1272 Email: gluck at pavlov.rutgers.edu  From peleg at cs.huji.ac.il Tue Feb 23 15:38:02 1993 From: peleg at cs.huji.ac.il (Shmuel Peleg) Date: Tue, 23 Feb 93 22:38:02 +0200 Subject: CFP: 12-ICPR, Int Conf Pattern Recognition, Jerusalem, 1994 Message-ID: <9302232038.AA28915@humus.cs.huji.ac.il> =============================================================================== CALL FOR PAPERS - 12th ICPR International Conferences on Pattern Recognition Oct 9-13, 1994, Jerusalem, Israel The 12th ICPR of the International Association for Pattern Recognition will be organized as a set of four conferences, each dealing with a special topic. The program for each individual conference will be organized by its own Program Committee. Papers describing applications are encouraged, and will be reviewed by a special Applications Committee. An award will be given for the best industry-related paper presented at the conference. Considerations for this award will include innovative applications, robust performance, and contributions to industrial progress. An exhibition will also be held. The conference proceedings are published by the IEEE Computer Society Press. GENERAL CO-CHAIRS: S. Ullman - Weizmann Inst. (shimon at wisdom.weizmann.ac.il) S. Peleg - The Hebrew University (peleg at cs.huji.ac.il) LOCAL ARRANGEMENTS: Y. Yeshurun - Tel-Aviv University (hezy at math.tau.ac.il) INDUSTRIAL & APPLICATIONS LIAISON: M. Ejiri - Hitachi (ejiri at crl.hitachi.co.jp) CONFERENCE DESCRIPTIONS 1. COMPUTER VISION AND IMAGE PROCESSING, T. Huang - University of Illinois Early vision and segmentation; image representation; shape and texture analysis; motion and stereo; range imaging and remote sensing; color; 3D representation and recognition. 2. PATTERN RECOGNITION AND NEURAL NETWORKS, N. Tishby - The Hebrew University Statistical, syntactic, and hybrid pattern recognition techniques; neural networks for associative memory, classification, and temporal processing; biologically oriented neural networks models; biomedical applications. 3. SIGNAL PROCESSING, D. Malah - Technion, Israel Institute of Technology Analysis, representation, coding, and recognition of signals; signal and image enhancement and restoration; scale-space and joint time-frequency analysis and representation; speech coding and recognition; image and video coding; auditory scene analysis. 4. PARALLEL COMPUTING, S. 
Tanimoto - University of Washington Parallel architectures and algorithms for pattern recognition, vision, and signal processing; special languages, programming tools, and applications of multiprocessor and distributed methods; design of chips, real-time hardware, and neural networks; recognition using multiple sensory modalities. PAPER SUBMISSION DEADLINE: February 1, 1994. Notification of Acceptance: May 1994. Camera-Ready Copy: June 1994. Send four copies of paper to: 12th ICPR, c/o International, 10 Rothschild blvd, 65121 Tel Aviv, ISRAEL. Tel. +972(3)510-2538, Fax +972(3)660-604 Each manuscript should include the following: 1. A Summary Page addressing these topics: - To which of the four conferences is the paper submitted? - What is the paper about? - What is the original contribution of this work? - Does the paper mainly describe an application, and should be reviewed by the applications committee? 2. The paper, limited in length to 4000 words. This is the estimated length of the proceedings version. For further information contact the secretariat at the above address, or use E-mail: icpr at math.tau.ac.il. ===============================================================================  From prechelt at ira.uka.de Tue Feb 23 08:55:11 1993 From: prechelt at ira.uka.de (prechelt@ira.uka.de) Date: Tue, 23 Feb 93 14:55:11 +0100 Subject: Squashing functions In-Reply-To: Your message of Fri, 19 Feb 93 16:47:16 +0000. <9302191647.AA05729@cato.robots.ox.ac.uk> Message-ID:

> Any interesting squashing function can be stored in a table of negligible size
> (eg 256) with very high accuracy if linear (or higher) interpolation is used.

256 points are not always negligible: On a fine-grain massively parallel machine such as the MasPar MP-1, the 256*4 bytes needed to store it can consume a considerable amount of the available memory. Our MP-1216A has 16384 processors with only 16 kB memory each. Another point: On this machine, I am not sure whether interpolating from such a table would really be faster than, say, a third order Taylor approximation of the sigmoid. Lutz Lutz Prechelt (email: prechelt at ira.uka.de) | Whenever you Institut fuer Programmstrukturen und Datenorganisation | complicate things, Universitaet Karlsruhe; D-7500 Karlsruhe 1; Germany | they get (Voice: ++49/721/608-4068, FAX: ++49/721/694092) | less simple.  From henrik at robots.ox.ac.uk Tue Feb 23 13:56:11 1993 From: henrik at robots.ox.ac.uk (henrik@robots.ox.ac.uk) Date: Tue, 23 Feb 93 18:56:11 GMT Subject: Squashing functions (continued) Message-ID: <9302231856.AA22594@cato.robots.ox.ac.uk> The saturation problem ('the activation function gets constant for large |x|') can usually be solved by putting the derivative of the act. function into a table as well. You can then cheat a bit by not setting it to zero at large |x|. Concerning memory requirements (eg, MasPar MP1). I don't see why I need 4 bytes per table entry. According to the paper by Fahlman & Hoehfeld on limited precision, the quantization can be done with very few bits (less than 8 if tricks are used). With interpolation you can get a pretty decent 16 bit act. value out of an 8-bit wide table. Apart from that, it seems to be quite complicated to put a NN on 16K processors ... how do you do that?
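A minimal C sketch of the table-plus-interpolation scheme being debated in this thread appears below; the table size, the clamping range, and all names are illustrative assumptions, not code from any of the posters.

/* Lookup-table approximation of tanh with linear interpolation.
   Table size, clamping range and names are illustrative choices. */
#include <stdio.h>
#include <math.h>

#define TABLE_SIZE 256
#define X_MAX      8.0                /* inputs clamped to [-X_MAX, X_MAX] */

static double table[TABLE_SIZE + 1];  /* one extra entry for interpolation */

static void build_table(void)
{
    int i;
    for (i = 0; i <= TABLE_SIZE; i++)
        table[i] = tanh(-X_MAX + 2.0 * X_MAX * i / TABLE_SIZE);
}

static double tanh_lookup(double x)
{
    double pos, frac;
    int i;

    if (x <= -X_MAX) return table[0];
    if (x >=  X_MAX) return table[TABLE_SIZE];
    pos  = (x + X_MAX) * (TABLE_SIZE / (2.0 * X_MAX));
    i    = (int)pos;
    frac = pos - i;
    return table[i] + frac * (table[i + 1] - table[i]);  /* linear interpolation */
}

int main(void)
{
    double x, worst = 0.0;
    build_table();
    for (x = -X_MAX; x <= X_MAX; x += 0.001) {
        double err = fabs(tanh_lookup(x) - tanh(x));
        if (err > worst) worst = err;
    }
    printf("worst absolute error over [-%.0f, %.0f]: %g\n", X_MAX, X_MAX, worst);
    return 0;
}

With these particular choices the reported worst-case error comes out on the order of 10^-4, which is the sort of accuracy the table-based approach is claimed to give; storing the derivative in a second table, as suggested above, follows the same pattern.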
From xueh at microsoft.com Wed Feb 24 01:19:47 1993
From: xueh at microsoft.com (Xuedong Huang)
Date: Tue, 23 Feb 93 22:19:47 PST
Subject: Microsoft Speech Research
Message-ID: <9302240620.AA07680@netmail.microsoft.com>

As you may know, I've started a new speech group here at Microsoft. For your
information, I have enclosed the full advertisement we have been using to
publicize the openings. If you are interested in joining MS, I strongly
encourage you to apply, and we will look forward to following up with you.

------------------------------------------------------------

THE FUTURE IS HERE.

Speech Recognition. Intuitive Graphical Interfaces. Sophisticated User
Agents. Advanced Operating Systems. Robust Environments. World Class
Applications. Who's Pulling It All Together? Microsoft.

We're setting the stage for the future of computing, building a world-class
research group and leveraging a solid foundation of object-based technology
and scalable operating systems. What's more, we're extending the recognition
paradigm, employing advanced processor and RISC-based architecture, and
harnessing distributed networks to connect users to worlds of information.
We want to see more than just our own software running. We want to see a
whole generation of users realize the future of computing. Realize your
future with a position in our Speech Recognition group.

Research Software Design Engineers, Speech Recognition.
Primary responsibilities include designing and developing user interface and
systems-level software for an advanced speech recognition system. A minimum
of 3 years' demonstrated microcomputer software design and development
experience in C is required. Knowledge of Windows programming, speech
recognition systems, hidden Markov model theory, statistics, DSP, or user
interface development is preferred. A BA/BS in computer science or a related
discipline is required. An advanced degree (MS or Ph.D.) in a related
discipline is preferred.

Researchers, Speech Recognition.
Primary responsibilities include research on stochastic modeling techniques
to be applied to an advanced speech recognition system. A minimum of 4
years' demonstrated research excellence in the area of speech recognition or
spoken language understanding systems is required. Knowledge of Windows and
real-time C programming for microcomputers, hidden Markov model theory,
decoder systems design, DSP, and spoken language understanding is preferred.
An MA/MS in CS or a related discipline is required. A PhD in CS, EE, or a
related discipline is preferred.

Make The Most of Your Future.
At Microsoft, our technical leadership and strong Software Developers and
Researchers stay ahead of the times, creating vision and turning it into
reality. To apply, send your resume and cover letter, noting
"ATTN: N5935-0223", to:

  Surface: Microsoft Recruiting
           ATTN: N5935-0223
           One Microsoft Way
           Redmond, WA 98052-6399

  Email (ASCII ONLY): y-wait at microsoft.com.us

Microsoft is an equal opportunity employer working to increase workforce
diversity.
From john at cs.uow.edu.au Fri Feb 26 13:56:21 1993
From: john at cs.uow.edu.au (John Fulcher)
Date: Fri, 26 Feb 93 13:56:21 EST
Subject: submission
Message-ID: <199302260256.AA25570@wraith.cs.uow.edu.au>

COMPUTER STANDARDS & INTERFACES (North-Holland)
Forthcoming Special Issue on ANN Standards

ADDENDUM TO ORIGINAL POSTING

Prompted by enquiries from several people regarding my original Call for
Papers posting, I felt I should offer the following additional information
(clarification).

By ANN "Standards" we do not mean exclusively formal standards (in the ISO,
IEEE, ANSI, CCITT, etc. sense), although naturally enough we will be
including papers on activities in these areas. "Standards" should be
interpreted in its most general sense, namely as standard APPROACHES (e.g.
the backpropagation algorithm and its many variants). Thus if you have a
paper on some (any?) aspect of ANNs, provided it is prefaced by a summary of
the standard approach(es) in that particular area, it could well be suitable
for inclusion in this special issue of CS&I.

If in doubt, post, fax, or email a copy by April 30th to:

  John Fulcher
  Department of Computer Science
  University of Wollongong
  Northfields Avenue, Wollongong NSW 2522, Australia

  fax: +61 42 213262
  email: john at cs.uow.edu.au.oz


From terry at helmholtz.sdsc.edu Thu Feb 25 14:57:05 1993
From: terry at helmholtz.sdsc.edu (Terry Sejnowski)
Date: Thu, 25 Feb 93 11:57:05 PST
Subject: Neural Computation 5:2
Message-ID: <9302251957.AA14806@helmholtz.sdsc.edu>

NEURAL COMPUTATION - Volume 5 - Issue 2 - March 1993

Review

  Neural Networks and Non-Linear Adaptive Filtering: Unifying Concepts and
  New Algorithms
    O. Nerrand, P. Roussel-Ragot, L. Personnaz, G. Dreyfus and S. Marcos

Notes

  Fast Calculation of Synaptic Conductances
    Rajagopal Srinivasan and Hillel J. Chiel

  The Variance of Covariance Rules for Associative Matrix Memories and
  Reinforcement Learning
    Peter Dayan and Terrence J. Sejnowski

  Optimal Network Construction by Minimum Description Length
    Gary D. Kendall and Trevor J. Hall

Letters

  A Neural Network Model of Inhibitory Information Processing in Aplysia
    Diana E.J. Blazis, Thomas M. Fischer and Thomas J. Carew

  Computational Diversity in a Formal Model of the Insect Olfactory
  Macroglomerulus
    C. Linster, C. Masson, M. Kerszberg, L. Personnaz and G. Dreyfus

  Learning Competition and Cooperation
    Sungzoon Cho and James A. Reggia

  Constraints on Synchronizing Oscillator Networks
    David E. Cairns, Roland J. Baddeley and Leslie S. Smith

  Learning Mixture Models of Spatial Coherence
    Suzanna Becker and Geoffrey E. Hinton

  Hints and the VC Dimension
    Yaser S. Abu-Mostafa

  Redundancy Reduction as a Strategy for Unsupervised Learning
    A. Norman Redlich

  Approximation and Radial-Basis-Function Networks
    Jooyoung Park and Irwin W. Sandberg

  A Polynomial Time Algorithm for Generating Neural Networks for Pattern
  Classification - its Stability Properties and Some Test Results
    Somnath Mukhopadhyay, Asim Roy, Lark Sang Kim and Sandeep Govil

  Neural Networks for Optimization Problems with Inequality Constraints -
  The Knapsack Problem
    Mattias Ohlsson, Carsten Peterson and Bo Soderberg

-----

SUBSCRIPTIONS - VOLUME 5 - BIMONTHLY (6 issues)

  ______ $40  Student
  ______ $65  Individual
  ______ $156 Institution

Add $22 for postage and handling outside USA (+7% GST for Canada).

(Back issues from Volumes 1-4 are regularly available for $28 each to
institutions and $14 each to individuals. Add $5 for postage per issue
outside USA (+7% GST for Canada).)

MIT Press Journals, 55 Hayward Street, Cambridge, MA 02142.
Tel: (617) 253-2889   FAX: (617) 258-6779   e-mail: hiscox at mitvma.mit.edu

-----


From mark at dcs.kcl.ac.uk Fri Feb 26 08:25:01 1993
From: mark at dcs.kcl.ac.uk (Mark Plumbley)
Date: Fri, 26 Feb 93 13:25:01 GMT
Subject: King's College London Neural Networks MSc and PhD courses
Message-ID: <17179.9302261325@xenon.dcs.kcl.ac.uk>

Fellow Neural Networkers,

Please post or forward this announcement about our M.Sc. and Ph.D. courses
in Neural Networks to anyone who might be interested.

Thanks,

Mark Plumbley

-------------------------------------------------------------------------
Dr. Mark D. Plumbley                                 Tel: +44 71 873 2241
Centre for Neural Networks                           Fax: +44 71 873 2017
Department of Mathematics/King's College London/Strand/London WC2R 2LS/UK
-------------------------------------------------------------------------

CENTRE FOR NEURAL NETWORKS and DEPARTMENT OF MATHEMATICS
King's College London
Strand
London WC2R 2LS, UK

M.Sc. AND Ph.D. COURSES IN NEURAL NETWORKS

---------------------------------------------------------------------
M.Sc. in INFORMATION PROCESSING and NEURAL NETWORKS
---------------------------------------------------
A ONE YEAR COURSE

CONTENTS
  Dynamical Systems Theory
  Fourier Analysis
  Biosystems Theory
  Advanced Neural Networks
  Control Theory
  Combinatorial Models of Computing
  Digital Learning
  Digital Signal Processing
  Theory of Information Processing
  Communications
  Neurobiology

REQUIREMENTS
  First Degree in Physics, Mathematics, Computing or Engineering

NOTE: For 1993/94 we have 3 SERC quota awards for this course.

---------------------------------------------------------------------
Ph.D. in NEURAL COMPUTING
-------------------------
A 3-year Ph.D. programme in NEURAL COMPUTING is offered to applicants with a
First degree in Mathematics, Computing, Physics or Engineering (others will
also be considered). The first year consists of courses given under the
M.Sc. in Information Processing and Neural Networks (see attached notice).
Second- and third-year research will be supervised in one of the various
programmes in the development and application of temporal, non-linear and
stochastic features of neurons in visual, auditory and speech processing.
There is also work in higher-level category and concept formation and
episodic memory storage. Analysis and simulation are used, on PCs, SUNs and
mainframe machines, and there is a programme on the development and use of
adaptive hardware chips in VLSI for pattern and speed processing.

This work is part of the activities of the Centre for Neural Networks in the
School of Physical Sciences and Engineering, which has over 40 researchers
in Neural Networks. It is one of the main centres of the subject in the U.K.

---------------------------------------------------------------------
For further information on either of these courses please contact:

  Postgraduate Secretary
  Department of Mathematics
  King's College London
  Strand
  London WC2R 2LS, UK

  MATHS at OAK.CC.KCL.AC.UK