From Connectionists-Request at CS.CMU.EDU Sun Mar 1 00:05:16 1992 From: Connectionists-Request at CS.CMU.EDU (Connectionists-Request@CS.CMU.EDU) Date: Sun, 01 Mar 92 00:05:16 EST Subject: Bi-monthly Reminder Message-ID: <15436.699426316@B.GP.CS.CMU.EDU> *** DO NOT FORWARD TO ANY OTHER LISTS *** This is an automatically posted bi-monthly reminder about how the CONNECTIONISTS list works and how to access various online resources. CONNECTIONISTS is not an edited forum like the Neuron Digest, or a free-for-all newsgroup like comp.ai.neural-nets. It's somewhere in between, relying on the self-restraint of its subscribers. Membership in CONNECTIONISTS is restricted to persons actively involved in neural net research. The following posting guidelines are designed to reduce the amount of irrelevant messages sent to the list. Before you post, please remember that this list is distributed to over a thousand busy people who don't want their time wasted on trivia. Also, many subscribers pay cash for each kbyte; they shouldn't be forced to pay for junk mail. Happy hacking. -- Dave Touretzky & Hank Wan --------------------------------------------------------------------- What to post to CONNECTIONISTS ------------------------------ - The list is primarily intended to support the discussion of technical issues relating to neural computation. - We encourage people to post the abstracts of their latest papers and tech reports. - Conferences and workshops may be announced on this list AT MOST twice: once to send out a call for papers, and once to remind non-authors about the registration deadline. A flood of repetitive announcements about the same conference is not welcome here. - Requests for ADDITIONAL references. This has been a particularly sensitive subject lately. Please try to (a) demonstrate that you have already pursued the quick, obvious routes to finding the information you desire, and (b) give people something back in return for bothering them. The easiest way to do both these things is to FIRST do the library work to find the basic references, then POST these as part of your query. Here's an example: WRONG WAY: "Can someone please mail me all references to cascade correlation?" RIGHT WAY: "I'm looking for references to work on cascade correlation. I've already read Fahlman's paper in NIPS 2, his NIPS 3 abstract, and found the code in the nn-bench archive. Is anyone aware of additional work with this algorithm? I'll summarize and post results to the list." - Announcements of job openings related to neural computation. - Short reviews of new text books related to neural computation. To send mail to everyone on the list, address it to Connectionists at CS.CMU.EDU ------------------------------------------------------------------- What NOT to post to CONNECTIONISTS: ----------------------------------- - Requests for addition to the list, change of address and other administrative matters should be sent to: "Connectionists-Request at cs.cmu.edu" (note the exact spelling: many "connectionists", one "request"). If you mention our mailing list to someone who may apply to be added to it, please make sure they use the above and NOT "Connectionists at cs.cmu.edu". - Requests for e-mail addresses of people who are believed to subscribe to CONNECTIONISTS should be sent to postmaster at appropriate-site. If the site address is unknown, send your request to Connectionists-Request at cs.cmu.edu and we'll do our best to help. A phone call to the appropriate institution may sometimes be simpler and faster. 
- Note that in many mail programs a reply to a message is automatically "CC"-ed to all the addresses on the "To" and "CC" lines of the original message. If the mailer you use has this property, please make sure your personal response (request for a Tech Report etc.) is NOT broadcast over the net. - Do NOT tell a friend about Connectionists at cs.cmu.edu. Tell him or her only about Connectionists-Request at cs.cmu.edu. This will save your friend from public embarrassment if she/he tries to subscribe. - Limericks should not be posted here. ------------------------------------------------------------------------------- The CONNECTIONISTS Archive: --------------------------- All e-mail messages sent to "Connectionists at cs.cmu.edu" starting 27-Feb-88 are now available for public perusal. A separate file exists for each month. The files' names are: arch.yymm where yymm stand for the obvious thing. Thus the earliest available data are in the file: arch.8802 Files ending with .Z are compressed using the standard unix compress program. To browse through these files (as well as through other files, see below) you must FTP them to your local machine. ------------------------------------------------------------------------------- How to FTP Files from the CONNECTIONISTS Archive ------------------------------------------------ 1. Open an FTP connection to host B.GP.CS.CMU.EDU (Internet address 128.2.242.8). 2. Login as user anonymous with password your username. 3. 'cd' directly to one of the following directories: /usr/connect/connectionists/archives /usr/connect/connectionists/bibliographies 4. The archives and bibliographies directories are the ONLY ones you can access. You can't even find out whether any other directories exist. If you are using the 'cd' command you must cd DIRECTLY into one of these two directories. Access will be denied to any others, including their parent directory. 5. The archives subdirectory contains back issues of the mailing list. Some bibliographies are in the bibliographies subdirectory. Problems? - contact us at "Connectionists-Request at cs.cmu.edu". ------------------------------------------------------------------------------- How to FTP Files from the Neuroprose Archive -------------------------------------------- Anonymous FTP on archive.cis.ohio-state.edu (128.146.8.52) pub/neuroprose directory This directory contains technical reports as a public service to the connectionist and neural network scientific community which has an organized mailing list (for info: connectionists-request at cs.cmu.edu) Researchers may place electronic versions of their preprints or articles in this directory, announce availability, and other interested researchers can rapidly retrieve and print the postscripts. This saves copying, postage and handling, by having the interested reader supply the paper. (Along this line, single spaced versions, if possible, will help!) To place a file, put it in the Inbox subdirectory, and send mail to pollack at cis.ohio-state.edu. Within a couple of days, I will move and protect it, and suggest a different name if necessary. When you announce a paper, you should consider whether (A) you want it automatically forwarded to other groups, like NEURON-DIGEST, (which gets posted to comp.ai.neural-networks) and if you want to provide (B) free or (C) prepaid hard copies for those unable to use FTP. If you do offer hard copies, be prepared for an onslaught. 
One author reported that when they allowed combination AB, the rattling around of their "free paper offer" on the worldwide data net generated over 2000 hardcopy requests! Experience dictates the preferred paradigm is to announce an FTP only version with a prominent "**DO NOT FORWARD TO OTHER GROUPS**" at the top of your announcement to the connectionist mailing list. Current naming convention is author.title.filetype[.Z] where title is enough to discriminate among the files of the same author. The filetype is usually "ps" for postscript, our desired universal printing format, but may be tex, which requires more local software than a spooler. Very large files (e.g. over 200k) must be squashed (with either a sigmoid function :) or the standard unix "compress" utility, which results in the .Z affix. To place or retrieve .Z files, make sure to issue the FTP command "BINARY" before transfering files. After retrieval, call the standard unix "uncompress" utility, which removes the .Z affix. An example of placing a file is attached as an appendix, and a shell script called Getps in the directory can perform the necessary retrival operations. For further questions contact: Jordan Pollack Assistant Professor CIS Dept/OSU Laboratory for AI Research 2036 Neil Ave Email: pollack at cis.ohio-state.edu Columbus, OH 43210 Phone: (614) 292-4890 Here is an example of naming and placing a file: gvax> cp i-was-right.txt.ps rosenblatt.reborn.ps gvax> compress rosenblatt.reborn.ps gvax> ftp archive.cis.ohio-state.edu Connected to archive.cis.ohio-state.edu. 220 archive.cis.ohio-state.edu FTP server ready. Name: anonymous 331 Guest login ok, send ident as password. Password:neuron 230 Guest login ok, access restrictions apply. ftp> binary 200 Type set to I. ftp> cd pub/neuroprose/Inbox 250 CWD command successful. ftp> put rosenblatt.reborn.ps.Z 200 PORT command successful. 150 Opening BINARY mode data connection for rosenblatt.reborn.ps.Z 226 Transfer complete. 100000 bytes sent in 3.14159 seconds ftp> quit 221 Goodbye. gvax> mail pollack at cis.ohio-state.edu Subject: file in Inbox. Jordan, I just placed the file rosenblatt.reborn.ps.Z in the Inbox. The INDEX sentence is "Boastful statements by the deceased leader of the neurocomputing field." Please let me know when it is ready to announce to Connectionists at cmu. BTW, I enjoyed reading your review of the new edition of Perceptrons! Frank ------------------------------------------------------------------------ How to FTP Files from the NN-Bench Collection --------------------------------------------- 1. Create an FTP connection from wherever you are to machine "pt.cs.cmu.edu" (128.2.254.155). 2. Log in as user "anonymous" with password your username. 3. Change remote directory to "/afs/cs/project/connect/bench". Any subdirectories of this one should also be accessible. Parent directories should not be. 4. At this point FTP should be able to get a listing of files in this directory and fetch the ones you want. Problems? - contact us at "nn-bench-request at cs.cmu.edu".  
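For readers who would rather script retrievals from the archives above than type the FTP dialogue by hand, the sketch below shows one way to automate an anonymous-FTP fetch from the neuroprose directory. It is only an illustration of the idea behind the Getps script mentioned earlier, not that script itself (whose contents may differ); it is written in present-day Python using the standard ftplib module, and the file name at the bottom is a made-up example following the author.title.filetype.Z convention.

# Minimal sketch: fetch one compressed paper from the neuroprose archive.
# Illustrative only -- not the Getps script; substitute the file you want.
from ftplib import FTP

def fetch_neuroprose(filename, host="archive.cis.ohio-state.edu"):
    ftp = FTP(host)                              # open the FTP connection
    ftp.login("anonymous", "your-username")      # anonymous login, username as password
    ftp.cwd("pub/neuroprose")                    # papers live in this directory
    with open(filename, "wb") as out:
        # retrbinary performs a BINARY-mode transfer, required for .Z files
        ftp.retrbinary("RETR " + filename, out.write)
    ftp.quit()
    # afterwards, run the standard unix "uncompress" on the retrieved .Z file

fetch_neuroprose("author.title.ps.Z")            # example name only

After retrieval, uncompress and print as usual (uncompress author.title.ps.Z ; lpr author.title.ps).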
From geiger at medusa.siemens.com Mon Mar 2 17:39:57 1992 From: geiger at medusa.siemens.com (Davi Geiger) Date: Mon, 2 Mar 92 17:39:57 EST Subject: No subject Message-ID: <9203022239.AA03797@medusa.siemens.com> CALL FOR PAPERS NEURAL INFORMATION PROCESSING SYSTEMS (NIPS) -Natural and Synthetic- Monday, November 30 - Thursday, December 3, 1992 Denver, Colorado This is the sixth meeting of an inter-disciplinary conference which brings together neuroscientists, engineers, computer scientists, cognitive scientists, physicists, and mathematicians interested in all aspects of neural processing and computation. A day of tutorial presentations (Nov 30) will precede the regular session and two days of focused workshops will follow at a nearby ski area (Dec 4-5). Major categories and examples of subcategories for paper submissions are the following; Neuroscience: Studies and Analyses of Neurobiological Systems, Inhibition in cortical circuits, Signals and noise in neural computation, Theoretical Neurobiology and Neurophysics. Theory: Computational Learning Theory, Complexity Theory, Dynamical Systems, Statistical Mechanics, Probability and Statistics, Approximation Theory. Implementation and Simulation: VLSI, Optical, Software Simulators, Implementation Languages, Parallel Processor Design and Benchmarks. Algorithms and Architectures: Learning Algorithms, Constructive and Pruning Algorithms, Localized Basis Functions, Tree Structured Networks, Performance Comparisons, Recurrent Networks, Combinatorial Optimization, Genetic Algorithms. Cognitive Science & AI: Natural Language, Human Learning and Memory, Perception and Psychophysics, Symbolic Reasoning. Visual Processing: Stereopsis, Visual Motion, Recognition, Image Coding and Classification. Speech and Signal Processing: Speech Recognition, Coding, and Synthesis, Text-to-Speech, Adaptive Equalization, Nonlinear Noise Removal. Control, Navigation, and Planning: Navigation and Planning, Learning Internal Models of the World, Trajectory Planning, Robotic Motor Control, Process Control. Applications: Medical Diagnosis or Data Analysis, Financial and Economic Analysis, Timeseries Prediction, Protein Structure Prediction, Music Processing, Expert Systems. The technical program will contain plenary, contributed oral and poster presentations with no parallel sessions. All presented papers will be due (January 13, 1993) after the conference in camera-ready format and will be published by Morgan Kaufmann. Submission Procedures: Original research contributions are solicited, and will be carefully refereed. Authors must submit six copies of both a 1000-word (or less) summary and six copies of a separate single-page 50-100 word abstract clearly stating their results postmarked by May 22, 1992 (express mail is not necessary). Accepted abstracts will be published in the conference program. Summaries are for program committee use only. At the bottom of each abstract page and on the first summary page indicate preference for oral or poster presentation and specify one of the above nine broad categories and, if appropriate, sub-categories (For example: Poster, Applications- Expert Systems; Oral, Implementation-Analog VLSI). Include addresses of all authors at the front of the summary and the abstract and indicate to which author correspondence should be addressed. Submissions will not be considered that lack category information, separate abstract sheets, the required six copies, author addresses, or are late. 
Mail Submissions To: Jack Cowan NIPS*92 Submissions University of Chicago Dept. of Mathematics 5734 So. University Ave. Chicago IL 60637 Mail For Registration Material To: NIPS*92 Registration SIEMENS Research Center 755 College Road East Princeton, NJ, 08540 All submitting authors will be sent registration material automatically. Program committee decisions will be sent to the correspondence author only. NIPS*92 Organizing Committee: General Chair, Stephen J. Hanson, Siemens Research & Princeton University; Program Chair, Jack Cowan, University of Chicago; Publications Chair, Lee Giles, NEC; Publicity Chair, Davi Geiger, Siemens Research; Treasurer, Bob Allen, Bellcore; Local Arrangements, Chuck Anderson, Colorado State University; Program Co-Chairs: Andy Barto, U. Mass.; Jim Burr, Stanford U.; David Haussler, UCSC ; Alan Lapedes, Los Alamos; Bruce McNaughton, U. Arizona; Barlett Mel, JPL; Mike Mozer, U. Colorado; John Pearson, SRI; Terry Sejnowski, Salk Institute; David Touretzky, CMU; Alex Waibel, CMU; Halbert White, UCSD; Alan Yuille, Harvard U.; Tutorial Chair: Stephen Hanson, Workshop Chair: Gerry Tesauro, IBM Domestic Liasons: IEEE Liaison, Terrence Fine, Cornell; Government & Corporate Liaison, Lee Giles, NEC; Overseas Liasons: Mitsuo Kawato, ATR; Marwan Jabri, University of Sydney; Benny Lautrup, Niels Bohr Institute; John Bridle, RSRE; Andreas Meier, Simon Bolivar U. DEADLINE FOR SUMMARIES & ABSTRACTS IS MAY 22, 1992 (POSTMARKED) please post 9  From STAY8026%IRUCCVAX.UCC.IE at bitnet.cc.cmu.edu Mon Mar 2 09:45:00 1992 From: STAY8026%IRUCCVAX.UCC.IE at bitnet.cc.cmu.edu (STAY8026%IRUCCVAX.UCC.IE@bitnet.cc.cmu.edu) Date: Mon, 2 Mar 1992 14:45 GMT Subject: Composite networks Message-ID: <01GH682VYUB400105W@IRUCCVAX.UCC.IE> Hi, I am interested in modelling tasks in which invariant information from previous input-output pairs is brought to bear on the acquisition of current input-output pairs. Thus I want to use previously extracted regularity to influence current processing. Does anyone think this is feasible?? At year's memory conference in Lancaster (England) Dave Rumelhart mentioned the need to develop nets which incorporate a distinction between learning and memory and exploit the attributes of both. Thus, our present learning procedures, such as BACKPROP, are useful for combining, without interference, multiple input-output pairs, while competitive learning systems are useful for computing regularities. How about attempting to combine the two? Has anyone tried? My suspicion is that such composite networks might be usefully applied to a number of issues in natural language processing, such as, perhaps, the syntactic embeddings considered by Pollack (1990, Artificial Intelligence). At first I thought that a sequential net of the sort discussed by Hinton and Shallice (1991 Psych. Rev.) might fit the bill, but now I'm not sure. Any ideas or suggestions? P. J. Hampson University College Cork Ireland  From elman at crl.ucsd.edu Tue Mar 3 13:14:43 1992 From: elman at crl.ucsd.edu (Jeff Elman) Date: Tue, 3 Mar 92 10:14:43 PST Subject: TR: 'Connectionism and the study of change' Message-ID: <9203031814.AA12473@crl.ucsd.edu> The following technical report is available via anonymous ftp or surface mail. Instructions on how to obtain it follow. ----------------------------------------------------------- Connectionism and the Study of Change Elizabeth A. Bates Jeffrey L. 
Elman Center for Research in Language University of California, San Diego CRL Technical Report 9202 Developmental psychology is not just about the study of children; it is also about the study of change. In this paper, we start with a brief historical review showing how the study of change has been abandoned within developmental psychology, in favor of strong "preformationist" views, of both the nativist variety (i.e. mental structures are present at birth, or they mature along a predetermined schedule) and the empiricist variety (i.e. mental structures are the "internalization" of social interactions with a competent adult). From either point of view, nothing really new can emerge across the course of development. These perspectives contrast with the truly interactionist view espoused by Jean Piaget, who argued for an emergence of new mental structures at the interface between organism and environment. We argue that these historical trends (including a widespread repudiation of Piaget) are rooted in the First Computer Metaphor, i.e. use of the serial digital computer as a metaphor for mind. Aspects of the First Computer Metaphor that have led to this kind of preformationism include (1) discrete representations, (2) absolute rules, (3) learning as programming and/or selection from an array of pre-established hypotheses, and (4) the hardware/software distinction. We go on to argue that connectionism (i.e. the Second Computer Metaphor) offers a way out of the Nature-Nurture impasse in developmental psychology, providing useful formalizations that capture the elusive notion of emergent form. In fact, connectionist models can be used to salvage certain forms of developmental nativism, offering concrete insights into what an innate idea might really look like (or better yet, 50% of an innate idea, awaiting completion through learning).
-----------------------------------------------------------
If you have access to the internet, you may obtain a copy of this report by doing the following: unix% ftp crl.ucsd.edu /* or: ftp 128.54.16.43 */ Connected to crl.ucsd.edu. 220 crl FTP server (SunOS 4.1) ready. Name (crl.ucsd.edu:elman): anonymous 331 Guest login ok, send ident as password. Password: 230 Guest login ok, access restrictions apply. ftp> binary 200 Type set to I. ftp> cd pub/neuralnets 250 CWD command successful. ftp> get tr9202.ps.Z 200 PORT command successful. 150 Binary data connection for tr9202.ps.Z (128.54.16.43,1507) (98903 bytes). 226 Binary Transfer complete. local: tr9202.ps.Z remote: tr9202.ps.Z 98903 bytes received in 0.57 seconds (1.7e+02 Kbytes/s) ftp> quit 221 Goodbye. unix% zcat tr9202.ps.Z | lpr If you are unable to obtain the TR in this manner, you may request a hardcopy by sending email to staight at crl.ucsd.edu (include your postal address).
From emelz at cognet.ucla.edu Tue Mar 3 14:39:32 1992 From: emelz at cognet.ucla.edu (Eric Melz) Date: Tue, 03 Mar 92 11:39:32 -0800 Subject: Composite networks In-Reply-To: Your message of Mon, 02 Mar 92 14:45:00 +0000. <01GH682VYUB400105W@IRUCCVAX.UCC.IE> Message-ID: <9203031939.AA10914@tinman.cognet.ucla.edu> >I am interested in modelling tasks in which invariant information from >previous input-output pairs is brought to bear on the acquisition of current >input-output pairs. Thus I want to use previously extracted regularity to >influence current processing. Does anyone think this is feasible?? I just submitted a paper to the upcoming Cognitive Science Conference entitled "Developing Microfeatures by Analogy".
The paper describes how a localist constraint-satisfaction model of analogical mapping can be used to guide the formation of the internal representations developed by backpropagation. Applied to Hinton's (1986) model of family-tree learning, this hybrid model exhibits near-perfect generalization performance when as many as 25% of the training cases are removed from the training corpus, compared to near 0% generalization when only backprop is used. If you (or anyone else) are interested in obtaining a hardcopy of this paper, send email directly to me. Eric Melz UCLA Department of Psychology  From pablo at cs.washington.edu Tue Mar 3 15:22:12 1992 From: pablo at cs.washington.edu (David Cohn) Date: Tue, 3 Mar 92 12:22:12 -0800 Subject: Composite networks Message-ID: <9203032022.AA21552@june.cs.washington.edu> P. J. Hampson (STAY8026%IRUCCVAX.UCC.IE at bitnet.cc.cmu.edu) asks: > I am interested in modelling tasks in which invariant information from > previous input-output pairs is brought to bear on the acquisition of current > input-output pairs. Thus I want to use previously extracted regularity to > influence current processing. ... I'm not sure I've read the intent of the posting correctly, but this sounds like it may be able to draw on some of the recent work in "active" learning systems. These are learning systems that have some control over what their inputs will be. A common form of active learning is that of "learning by queries," where a neural network "asks for" new training examples from some part of a domain based on its evaluation of of previous training examples (e.g. Cohn et al., Baum and Lang, Hwang et al. and MacKay). > At year's memory conference in Lancaster (England) Dave Rumelhart mentioned > the need to develop nets which incorporate a distinction between learning > and memory and exploit the attributes of both. ... The distinction usually made in querying is a bit different, and consists of a loop iterating between sampling and learning. With a neural network, the "learning" generally consists of simply training on the sampled data, and thus could be thought of as analogous to stuffing the data into "memory." The driving algorithm directs the sampling based on previously learned data to optimize the utility of new examples. This approach has been tried on a number of relatively complicated, yet still "toy" problems; current efforts are to overcome the representational and computational complexity problems that arise as one makes the transition to the domain of "real world" problems like speech. -David Cohn e-mail: pablo at cs.washington.edu Dept. of Computer Science & Eng. phone: (206) 543-7798 University of Washington, FR-35 Seattle, WA 98195 References: Cohn, Atlas & Ladner. (1990) Training Connectionist Networks with Queries and Selective Sampling. In D. Touretzky, ed., "Advances In Neural Info. Processing 2" Baum and Lang. (1991) Constructing Hidden Units using Examples and Queries. In Lippmann et al., eds., "Advances In Neural Info. Processing 3" Hwang, Choi, Oh and Marks. (1990) Query learning based on boundary search and gradient computation of trained multilayer perceptrons. In Proceedings, IJCNN 1990. MacKay. (1991) Bayesian methods for adaptive models. Ph.D. thesis, Dept. of Computation and Neural Systems, California Institute of Technology.  
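To make the sampling/learning loop described above a little more concrete, here is a schematic sketch of learning by queries. It is not code from any of the papers cited; the names (fit, informativeness, oracle and so on) are invented for illustration, and the criterion for choosing the next query is deliberately left abstract, since the cited methods differ precisely in how they score candidate inputs. It is written in present-day Python for readability.

# Schematic "learning by queries" loop: alternate between training on the
# labeled examples gathered so far (the "memory"/learning step) and asking
# the teacher to label the most informative remaining input (the sampling
# step).  All names are illustrative, not taken from the cited papers.
def query_learning_loop(model, labeled, pool, oracle, informativeness, n_queries):
    # labeled:         list of (x, y) pairs seen so far
    # pool:            unlabeled candidate inputs the learner may ask about
    # oracle(x):       returns the true label of x (the teacher/experiment)
    # informativeness: scores how useful labeling x would be under the
    #                  current model, e.g. proximity to the decision boundary
    for _ in range(n_queries):
        xs = [x for x, _ in labeled]
        ys = [y for _, y in labeled]
        model.fit(xs, ys)                                # learning step
        x_new = max(pool, key=lambda x: informativeness(model, x))
        pool.remove(x_new)                               # sampling step
        labeled.append((x_new, oracle(x_new)))
    return model

In practice the interesting part is the informativeness measure; the references above (Cohn et al., Baum and Lang, Hwang et al., MacKay) each propose a different way of deciding which input to query next.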
From pratt at cs.rutgers.edu Tue Mar 3 17:46:38 1992 From: pratt at cs.rutgers.edu (pratt@cs.rutgers.edu) Date: Tue, 3 Mar 92 17:46:38 EST Subject: Composite networks Message-ID: <9203032246.AA14834@binnacle.rutgers.edu> P.J. Hampson asks: >From STAY8026 at iruccvax.ucc.ie Mon Mar 2 09:45:00 1992 >Subject: Composite networks >Hi, > >I am interested in modelling tasks in which invariant information from >previous input-output pairs is brought to bear on the acquisition of current >input-output pairs. Thus I want to use previously extracted regularity to >influence current processing. Does anyone think this is feasible?? >... Papers on how networks can be constructed modularly (source and target have different topologies, are responsible for different classes) include: [Waibel et al., IEEASSP], [Pratt & Kamm, IJCNN91], [Pratt et al., AAAI91], [Pratt, CLNL92]. The working title and abstract for my upcoming PhD thesis (to be described in [Pratt, 1992b]) are as follows: Transferring Previously Learned Back-Propagation Neural Networks to New Learning Tasks Lori Pratt Neural network learners traditionally extract most of their information from a set of training data. If training data is in short supply, the learned classifier may perform poorly. Although this problem can be addressed partly by carefully choosing network parameters, this process is ad hoc and requires expertise and manual intervention by a system designer. Several symbolic and neural network inductive learners have explored how a domain theory which supplements training data can be automatically incorporated into the training process to bias learning. However, research to date in both fields has largely ignored an important potential knowledge source: classifiers that have been trained previously on related tasks. If new classifiers were able to build directly on previous results, then training speed, performance, and the ability to effectively utilize small amounts of training data could potentially be substantially improved. This thesis introduces the problem of {\em transfer} of information from a trained learner to a new learning task. It also presents an algorithm for transfer between neural networks. Empirical results from several domains demonstrate that this algorithm can improve learning speed on a variety of tasks. This will be published in part as [Pratt, 1992]. --Lori -------------------------------------------------------------------------------- References: @article{ waibel-89b, MYKEY = " waibel-89b : .bap .unr .unb .tem .spc .con ", TITLE = "Modularity and Scaling in Large Phonemic Neural Networks", AUTHOR = "Alexander Waibel and Hidefumi Sawai and Kiyohiro Shikano", journal = "IEEE Transactions on Acoustics, Speech, and Signal Processing", VOLUME = 37, NUMBER = 12, MONTH = "December", YEAR = 1989, PAGES = {1888-1898} } @inproceedings{ pratt-91, MYKEY = " pratt-91 : .min .bap .app .spc .con ", AUTHOR = "Lorien Y. Pratt and Jack Mostow and Candace A. Kamm", TITLE = "{Direct Transfer of Learned Information among Neural Networks}", BOOKTITLE = "Proceedings of the Ninth National Conference on Artificial Intelligence (AAAI-91)", PAGES = {584--589}, ADDRESS = "Anaheim, CA", YEAR = 1991, } @inproceedings{ pratt-91b, MYKEY = " pratt-91b : .min .bap .app .spc .con ", AUTHOR = "Lorien Y. Pratt and Candace A. 
Kamm", TITLE = "Improving a Phoneme Classification Neural Network through Problem Decomposition", YEAR = 1991, MONTH = "July", BOOKTITLE = "Proceedings of the International Joint Conference on Neural Networks (IJCNN-91)", ADDRESS = "Seattle, WA", PAGES = {821--826}, ORGANIZATION = "IEEE", } @incollection{ pratt-92, MYKEY = " pratt-92 : .min .bap .app .spc .con ", AUTHOR = "Lorien Y. Pratt", TITLE = "Experiments on the Transfer of Knowledge Between Neural Networks", BOOKTITLE = "Computational Learning Theory and Natural Learning Systems, Constraints and Prospects", EDITOR = "S. Hanson and G. Drastal and R. Rivest", YEAR = 1992, PUBLISHER = "MIT Press", CHAPTER = "4.1", NOTE = "To appear", } @incollection{ pratt-92b, MYKEY = " pratt-92b : .min .bap .app .spc .con ", AUTHOR = "Lorien Y. Pratt", TITLE = "Non-literal information transfer between neural networks", BOOKTITLE = "Neural Networks: Theory and Applications {II}", EDITOR = "R.J.Mammone and Y. Y. Zeevie", YEAR = 1992, PUBLISHER = "Academic Press", NOTE = "To appear", } ------------------------------------------------------------------- L. Y. Pratt ,_~o Computer Science Department pratt at cs.rutgers.edu _-\_<, Rutgers University (*)/'(*) Hill Center (908) 932-4974 (CoRE building office) New Brunswick, NJ 08903, USA (908) 846-4766 (home)  From echown at engin.umich.edu Tue Mar 3 16:19:53 1992 From: echown at engin.umich.edu (Eric Chown) Date: Tue, 3 Mar 92 16:19:53 -0500 Subject: Article available Message-ID: <5756ff371.000b0f6@mtrans.engin.umich.edu> **DO NOT FORWARD TO OTHER GROUPS** The following paper has been added to the archive: Tracing Recurrent Activity in Cognitive Elements (TRACE): A Model of Temporal Dynamics in a Cell Assembly Stephen Kaplan, Martin Sonntag and Eric Chown Department of Electrical Engineering and Computer Science, and Department of Psychology, The University of Michigan The article appeared in Connection Science 3:179-206 (1991) Abstract: Hebb's introduction of the cell assembly concept marks the beginning of modern connectionism, yet its implications remail largely unexplored and its potential unexploited. Lately, however, promising efforts have been made to utilize recurrent connections, suggesting the timeliness of a reexamination of the cell assembly as a key element in a cognitive connectionism. Our approach emphasizes the psychological functions of activity in a cell assembly. This provides an opportunity to explore the dynamic behavior of the cell assembly considered as a continuous system, an important topic that we feel has not been given sufficient attention. A step-by-step analysis leads to an identification of characteristic temporal patterns and of necessary control systems. Each step of this analysis leads to a corresponding building block in a set of emerging equations. A series of experiments is then described that explore the implications of the theoretically derived equations in terms of the time course of activity generated by a simulation under different conditions. Finally the model is evaluated in terms of whether the various contraints deemed appropriate can be met, whether the resulting solution is robust, and whether the solution promises sufficient utility and generality. ------------------------------------------------------------------------------- This paper has been placed in the neuroprose directory in compressed form. 
The file name is kaplan.trace.ps.Z If your are unable to ftp this paper please address reprint requests to: Professor Stephen Kaplan Psychological Laboratories Mason Hall The University of Michigan Ann Arbor, MI 48109-1027 e-mail: Stephen_Kaplan at um.cc.umich.edu Here is how to ftp this paper: unix> ftp cheops.cis.ohio-state.edu (or 128.146.8.62) Name: anonymous Password: neuron ftp> cd pub/neuroprose ftp> binary ftp> get kaplan.trace.ps.Z ftp> quit unix> uncompress kaplan.trace.ps.Z unix> lpr kaplan.trace.ps  From inf21!finnoff at ztivax.uucp Wed Mar 4 06:48:14 1992 From: inf21!finnoff at ztivax.uucp (William Finnoff) Date: Wed, 4 Mar 92 12:48:14 +0100 Subject: BP-Stoc. Appr. Message-ID: <9203041148.AA07939@basalt> I'm looking for additional references on the relationship between pattern for pattern backpropagation and stochastic approximation/algorithms/optimization (Robins Munroe, etc.). I am aware of the following White, H., Some asymptotic results for learning in single hidden-layer feedforward network models, Jour. Amer. Stat. Ass. 84, no. 408, p. 1003-1013, (1989), White, H., Learning in artificial neural networks: A statistical perspective, Neural Computation 1, 1989, pp. 425-464, Darken C. and Moody J., Note on learning rate schedules for stochastic optimization, in Advances in Neural Information Processing Systems 3., Lippmann, R. Moody, J., and Touretzky, D., ed., Morgan Kaufmann, San Mateo, (1991), and the latest contribution of the last two authors at NIPS(91). I have a fairly good list of literature with regards to the general theory of stochastic algorithms. Three examples are listed below: Bouton C., Approximation Gaussienne d'algorithmes stochastiques a dynamique Markovienne. Thesis, Paris VI, (in French), (1985) Kushner, H.J., and Schwartz, A., An invariant measure approach to the convergence of stochastic approximations with state dependent noise. SIAM j. Control and Opt. 22, 1, p. 13-27, (1984) Metivier, M. and Priouret, P., Th'eor`emes de convergence presque-sure pour une classe d'algorithmes stochastiques `a pas d'ecroissant. Prob. Th. and Rel. Fields 74, p. 403-28, (in French), (1987). I am therefore only interested in references in which the relationship to BP is explicit. Any help in this matter will be appreciated.  From georg at ai.univie.ac.at Wed Mar 4 11:57:45 1992 From: georg at ai.univie.ac.at (Georg Dorffner) Date: Wed, 4 Mar 1992 17:57:45 +0100 Subject: Symposium announcement Message-ID: <199203041657.AA07770@chicago.ai.univie.ac.at> Announcement Symposium on CONNECTIONISM AND COGNITIVE PROCESSING as part of the Eleventh European Meeting on Cybernetics and Systems Research (EMCSR) April 21 - 24, 1992 University of Vienna, Austria Chairs: Noel Sharkey (Univ. of Exeter) Georg Dorffner (Univ. of Vienna) The following papers will be presented: Thursday afternoon (Apr. 23) Conflict Detection in Asynchronous Winner-Take-All Structures M.Deng, Penn State Univ., USA Weightless Neurons and Categorisation Modelling M.H.Gera, Univ. of London, UK Semantic Transitions in a Hierarchical Memory Network M.Herrmann, M.Usher, Tel Aviv Univ., ISR Type Generalisations on Distributed Representations D.Mundi, N.E.Sharkey, Univ. of Exeter, UK Non-Conceptual Content and Parallel Distributed Processing, a Match Made in Cognitive Science Heaven? R.L.Chrisley, Univ. of Oxford, UK Aspects of Rules and Connectionism E.Prem, Austrian Research Inst. for AI, A Friday Morning (April 24) Sub-symbolic Inference: Inferring Verb Meaning A.Baldwin, Univ. 
of Exeter, UK Mental Models in Connectionist Networks V.Ajjanagadde, Univ. of Tuebingen, D Connectionism and the Issue of Compositionality and Systematicity L.Niklasson, N.E.Sharkey, Univ. of Skoevde, S Friday Afternoon INVITED LECTURE: The Causal Role of the Constituents of Superpositional Representations N.E.Sharkey, Univ. of Exeter, UK followed by a moderated discussion with all previous presenters Section Neural Networks EMG/EEG Pattern Recognition by Neural Networks A.Hiraiwa, N.Uchida, K.Shimohara, NTT Human Interface Labs., J Simulation of Navigation of Mobile Robots with Non-Centralized Neuromorphic Control L.F.B. Almeida, E.P.L.Passos, Inst.Militar de Engenharia, BRA Weightless and Threshold-Controlled Neurocomputing O.Kufudaki, J.Horejs, Czechoslovak Acad.of Sciences, CS Neural Networks Learning with Genetic Algorithms P.M.Palagi, L.A.V. de Carvalho, Univ.Fed.do Rio de Janeiro, BRA - * - Among the plenary lectures of the conference, Tuesday morning will feature Fuzzy Logic, Neural Networks and Soft Computing L. Zadeh, UC Berkeley, USA Furthermore, the conference will include symposia on the following topics: - General Systems Methodology - Mathematical Systems Theory - Computer Aided Process Interpretation - Fuzzy Sets, Approximate Reasoning and Knowledge-Based Systems - Designing and Systems - Humanity, Architecture and Conceptualization - Biocybernetics and Mathematical Biology - Cybernetics in Medicine - Cybernetics of Socio-Economic Systems - Systems, Management and Organization - Cybernetics of National Development - Communication and Computers - Intelligent Autonomous Systems - Artificial Intelligence - Impacts of Artificial Intelligence Conference Fee: AS 2,400 for contributors, AS 3,400 for participants. (incl.proceedings (2 volumes) and two receptions; 12 AS = 1 US$ approx.). The proceedings will also be available from World Scientific Publishing Co., entitled "Cybernetics and Systems '92; R.Trappl (ed.)" Registration will be possible at the conference site (main building of the University of Vienna). You can also contact: EMCSR Conference Secretariat Austrian Society for Cybernetic Studies Schottengasse 3 A-1010 Vienna, Austria Tel: +43 1 535 32 810 Fax: +43 1 63 06 52 Email: sec at ai.univie.ac.at  From LC4A%ICINECA.BITNET at BITNET.CC.CMU.EDU Wed Mar 4 14:19:46 1992 From: LC4A%ICINECA.BITNET at BITNET.CC.CMU.EDU (F. Ventriglia) Date: Wed, 04 Mar 92 14:19:46 SET Subject: Neural Networks School Message-ID: <01GH8N7EHME8CTZB4P@BITNET.CC.CMU.EDU> Dear Sir, I would like to inform you that I am organizing an International School on Neural Modeling and Neural Networks, as moderator of this E-mail Network could you propagate the notice of this School among the subscribers of your network? Could you insert my name in the list of subscribers? Looking forward to hearing from you. My best wishes and regards. Francesco Ventriglia ================ INTERNATIONAL SCHOOL on NEURAL MODELLING and NEURAL NETWORKS Capri (Italy) - September 27th-October 9th, 1992 Director F. Ventriglia An International School on Neural Modelling and Neural Networks was organized under the sponsorship of the Italian Group of Cybernetics and Biophysics of the CNR and of the Institute of Cybernetics of the CNR; sponsor the American Society for Mathematical Biology. The purpose of the school is to give to young scientists and to migrating senior scientists some landmarks in the inflationary universe of researches in neural modelling and neural networks. 
Towards this aim some well known experts will give lectures in different areas comprising neural structures and functions, single neuron dynamics, oscillations in small group of neurons, statistical neurodynamics of neural networks, learning and memory. In the first part, some neurobiological foundations and some formal models of single (or small groups of) neurons will be stated. The topics will be: TOPICS LECTURERS 1. Neural Structures * Szentagothai, Budapest 2. Correlations in Neural Activity * Abeles, Jerusalem 3. Single Neuron Dynamics: deterministic models * Rinzel, Bethesda 4. Single Neuron Dynamics: stochastic models * Ricciardi, Naples 5. Oscillations in Neural Systems * Ermentrout, Pittsburgh 6. Noise in Neural Systems * Erdi, Budapest The second part will be devoted to Neural Networks, i.e. to models of neural systems and of learning and memory. The topics will be: TOPICS LECTURERS 7. Mass action in Neural Systems * Freeman, Berkeley 8. Statistical Neurodynamics: kinetic approach * Ventriglia, Naples 9. Statistical Neurodynamics: sigmoidal approach * Cowan, Chicago 10.Attractor Neural Networks in Cortical Conditions * Amit, Roma 11."Real" Neural Network Models * Traub, Yorktown Heights 12.Pattern Recognition in Neural Networks * Fukushima, Osaka 13.Learning in Neural Networks * Tesauro, Yorktown Heights WHO SHOULD ATTEND Applicants for the international School should be actively engaged in the fields of biological cybernetics, biomathematics or computer science, and have a good background in mathematics. As the number of participants must be limited to 70, preference may be given to students who are specializing in neural modelling and neural networks and to professionals wha are seeking new materials for biomathematics or computer science courses. SCHOOL FEES The school fee is Italian Lire 500.000 and includes notes, lunch and coffee- break for the duration of the School. REGISTRATION A limited number of grants (covering the registration fee of Lit. 500.000) is available. The organizator applied to the Society of Mathematical Biology for travel funds for participants who are member of the SMB. Preference will be given to students, postdoctoral fellows and young faculty (1-2 years) after PhD. PROCEDURE FOR APPLICATION Applicants should contact: Dr. F. Ventriglia Registration Capri International School Istituto di Cibernetica Via Toiano 6 80072 - Arco Felice (NA) Italy Tel. (39-) 81-8534 138 E-Mail LC4A at ICINECA (bitnet) Fax (39-) 81-5267 654 Tx 710483  From patkins at laurel.ocs.mq.edu.au Thu Mar 5 12:09:48 1992 From: patkins at laurel.ocs.mq.edu.au (Paul Atkins) Date: Thu, 5 Mar 92 12:09:48 EST Subject: Ability to generalise over multiple training runs Message-ID: <9203050209.AA16943@laurel.ocs.mq.edu.au> In their recent technical report, Bates & Elman note that "it is possible for several different networks to reach the same solution to a problem, each with a totally different set of weights." (p. 13b) I am interested in the relationship between this phenomenon and the measurement of a network's ability to generalise. From singh at envy.cs.umass.edu Wed Mar 4 10:33:27 1992 From: singh at envy.cs.umass.edu (singh@envy.cs.umass.edu) Date: Wed, 4 Mar 92 10:33:27 -0500 Subject: Composite Networks Message-ID: <9203041533.AA11969@gluttony.cs.umass.edu> Hi! ***This is in repsonse to P. J. 
Hampson's message about composite networks.*** >From STAY8026 at iruccvax.ucc.ie Mon Mar 2 09:45:00 1992 >Subject: Composite networks >Hi, > >I am interested in modelling tasks in which invariant information from >previous input-output pairs is brought to bear on the acquisition of current >input-output pairs. Thus I want to use previously extracted regularity to >influence current processing. Does anyone think this is feasible?? >... I have studied learning agents that have to learn to solve MULTIPLE sequential decision tasks (SDTs) in the same external environment. Specifically, I have looked at reinforcement learning agents that have to solve a set of compositionally-structured sequential decision tasks. E.g., consider a navigation environment ( a robot in a room): Task 1: Go to location A optimally. Task 2: Go to location B optimally. Task 3: Go to location A and then to B optimally. Tasks 1 and 2 are 'elemental' SDTs and Task 3 is a 'composite' SDT. I have studied two different ways of achieving the obvious kind of TRANSFER of LEARNING across such a set of tasks. I am going to (try and) be brief - anyone interested in further discussion or my papers can contact me individually. Method 1: ********* I have used a modified Jacobs-Jordan-Nowlan-Hinton ``mixture of expert modules'' network with Watkin's Q-learning algorithm to construct a mixture of ``adaptive critics'' that learns the elemental tasks in separate modules and then the gating module learns to sequence the correct elemental modules to solve the composite tasks. Note, that the representation of the tasks is not ``linguistic'' and therefore the agent cannot simply ``parse'' the composite task representation to determine which elemental modules to sequence. The decomposition has to be discovered by trial-and-error. Transfer of learning is achieved by sharing the solution of previously acquired elemental tasks across multiple composite tasks. Sequential decision tasks are particularly difficult to learn to solve because there is no supervised target information, only a success/failure reponse at the end of the task. Ref: --- @InProceedings{Singh-NIPS4, author = "Singh,S.P.", title = "On the efficient learning of multiple sequential tasks", booktitle = "Advances in Neural Information Processing Systems 4", year = "1992", editor = "J.E. Moody and S.J. Hanson and R.P. Lippman", OPTpages = "", OPTorganization = "", publisher = "Morgan Kauffman", address = "San Mateo, CA", OPTmonth = "", OPTnote = "Oral"} @Article{Singh-MLjournal, author = "Singh,S.P.", title = "Transfer of Learning by Composing Solutions for Elemental Sequential Tasks", journal = "Machine Learning", year = "1992", OPTvolume = "", OPTnumber = "", OPTpages = "", OPTmonth = "", OPTnote = "to appear"} @phdthesis{Watkins-thesis, author="C. J. C. H. Watkins", title="Learning from Delayed Rewards", school="Cambridge Univ.", address="Cambridge, England", year=1989} @Article{Jacobs-Jordan-Nowlan-Hinton, author = "R. A. Jacobs and M. I. Jordan and S. J. Nowlan and G. E. Hinton", title = "Adaptive Mixtures of Local Experts", journal = "Neural Computation", year = "1991", volume = "3", number = "1", OPTpages = "", OPTmonth = "", OPTnote = "" } Method 2: ********* Method 1 did not learn models of the environment. 
For learning to solve a single SDT it is not always clear that the considerable expense of doing system identification is warranted (Barto and Singh, Gullapalli), however if an agent is going to solve multiple tasks in the same environment it is almost certainly going to be useful. I consider a hierarchy of world-models, where the ``actions/operators'' for upper level models are the policies for tasks lower in the hierarchy. I prove that for compositionally-structured tasks, doing Dynamic Programming in such upper level models leads to the same solutions as doing it in the real world - only it is much faster since the actions of the upper level world models ignore much TEMPORAL DETAIL. Ref: **** @InProceedings{Singh-AAAI92, author = "Singh, S.P.", title = "Reinforcement learning with a hierarchy of abstract models", booktitle = "Proceedings of the Tenth National Conference on Artificial Intelligence", year = "1992", OPTeditor = "", OPTpages = "", OPTorganization = "", OPTpublisher = "", address = "San Jose,CA", OPTmonth = "", note = "Forthcoming" } @InProceedings{Singh-ML92, author = "Singh, S.P.", title = "Scaling reinforcement learning algorithms by learning variable temporal resolution models", booktitle = "Proceedings of the Machine Learning Conference, 1992", year = "1992", OPTeditor = "", OPTpages = "", OPTorganization = "", OPTpublisher = "", OPTaddress = "", OPTmonth = "", OPTnote = "to appear" } @inproceedings{Barto-Singh, title="On the Computational Economics of Reinforcement Learning", author="Barto, A.G. and Singh,S.P.", booktitle="Proceedings of the 1990 Connectionist Models Summer School", year="Nov. 1990", address="San Mateo, CA", editors="Touretzsky, D.S. and Elman, J.L. and Sejnowski, T.J. and Hinton, G.E.", publisher="Morgan Kaufmann", status="Read"} @incollection{Gullapalli, author="V. Gullapalli", title="A Comparison of Supervised and Reinforcement Learning Methods on a Reinforcement Learning Task", booktitle = "Proceedings of the 1991 {IEEE} Symposium on Intelligent Control", address="Arlington, VA", year="1991"} satinder. satinder at cs.umass.edu  From robtag at udsab.dia.unisa.it Wed Mar 4 09:59:26 1992 From: robtag at udsab.dia.unisa.it (Tagliaferri Roberto) Date: Wed, 4 Mar 92 15:59:26 +0100 Subject: Post-doc Positions at IIASS Message-ID: <9203041459.AA27538@udsab.dia.unisa.it> I.I.A.S.S. International Institute for Advanced Scientific Studies Vietri sul mare (Salerno) Italy The IIASS's main research interests lie in the areas of Neural Networks, Machine Learning, Computer Science and Theoretical Physics. The Institute works in close cooperation with the Departments of Computer Science and Theoretical Physics of the nearby University of Salerno. It calls for applicants for the following POSTDOCTORAL RESEARCH POSITIONS 1. Theory and Applications of Hybrid Systems. 2. Analysis of Monodimensional Signals, mainly ECG and Continuous Speech with Neural Nets. 3. Pattern Recognition and Machine Vision. 4. Machine Learning and Robotics. Salary will be according to Italian standards. The candidates should possess a Ph.D. or, at least, four years of documented scientific research activity in the indicated areas. Age limit: 35 years. Applications should include the following elements: - curriculum vitae: academic career, last position held, detailed documentation of scientific activity. - 2 letters of presentation. - Specification of present and proposed research activity of the applicant. Applications must be addressed to the President of IIASS Professor E.R. 
Caianiello IIASS via G. Pellegrino, 19 I-84019 Vietri sul mare (Salerno) Italy Deadline: May 31, 1992. For information contact: Dr. Roberto Tagliaferri Dept. Informatica ed Applicazioni Universita' di Salerno I-84081 Baronissi (Salerno) Italy tel. +39 89 822263 fax. +39 89 822275\2 e-mail address: robtag at udsab.dia.unisa.it
From tenorio at ecn.purdue.edu Thu Mar 5 17:00:40 1992 From: tenorio at ecn.purdue.edu (tenorio@ecn.purdue.edu) Date: Thu, 5 Mar 1992 16:00:40 -0600 Subject: Ability to generalise over multiple training runs Message-ID: <9203052051.AA01286@dynamo.ecn.purdue.edu> >In their recent technical report, Bates & Elman note that "it is possible >for several different networks to reach the same solution to a problem, each >with a totally different set of weights." (p. 13b) I am interested in the >relationship between this phenomenon and the measurement of a network's >ability to generalise. I have not seen the report yet, and I don't understand the assumptions made to reach this conclusion, but if the solution is defined as some quality of approximation of a finite number of points in the training or testing set, then I contend that there is a very large, possibly infinite, number of networks that would yield the same "solution." This holds whether the networks are built with different architectures or a single architecture is chosen, whether the task is classification or interpolation, and whether or not the weights are allowed to be real valued. A simple modification of the input variable order, or the presentation order, or the functions of the nodes, or the initial points, or the number of hidden nodes would lead to different nets. The only way to talk about the "optimum weights" (for a fixed architecture in all respects) is if the function is defined at EVERY possible point. For classification tasks, for example, how many ways can a closed contour be defined with hyperplanes? Or in interpolation, how many functions perfectly visit the data points, yet can do wildly different things at the "don't care" points? Therefore, a function defined by a finite number of points can be represented by an equivalent family of functions within an epsilon of error, regardless of how big the finite set is. > >>From the above I presume (possibly incorrectly) that, if there are many >possible solutions, then some of them will work well for new inputs and >others will not work well. So on one training run a network may appear to >generalise well to a new input set, while on another it does not. Does this >mean that, when connectionists refer to the ability of a network to >generalise, they are referring to an average ability over many trials? Has >anyone encountered situations in which the same network appeared to >generalise well on one learning trial and poorly on another? > This issue came up on this list a couple of weeks ago in a discussion about regularization and network training. It has to do with the network's power to express the function (does the network have more or fewer degrees of freedom than needed), the limited number of points, and the fact that these points can be noisy and a poor representation of the function itself. Things can get even more hectic if the training set is not a faithful representation of the distribution of the function because of the way it was designed. I'll let the people who published reports on this and contributed to the discussion contact you directly with their views. >Reference: >Bates, E.A. & Elman, J.L. (1992).
Connectionism and the study > of change. CRL Technical Report 9202, (February). > >-- >Paul Atkins email: patkins at laurel.mqcc.mq.oz.au >School of Behavioural Sciences phone: (02) 805-8606 >Macquarie University fax : (02) 805-8062 >North Ryde, NSW, 2113 >Australia. < Manoel Fernando Tenorio > < (tenorio at ecn.purdue.edu) or (..!pur-ee!tenorio) > < MSEE233D > < Parallel Distributed Structures Laboratory > < School of Electrical Engineering > < Purdue University > < W. Lafayette, IN, 47907 > < Phone: 317-494-3482 Fax: 317-494-6440 >  From cherwig at eng.clemson.edu Sun Mar 8 03:19:24 1992 From: cherwig at eng.clemson.edu (christoph bruno herwig) Date: Sun, 8 Mar 92 03:19:24 EST Subject: linear separability Message-ID: <9203080819.AA27836@eng.clemson.edu> I was wondering if someone could point me in the right direction concerning the following fundamental separability problem: Given a binary (-1/1) valued training set consisting of n-dimensional input vectors (homogeneous coordinates: n-1 inputs and a 1 for the bias term as the n-th dimension) and 1-dimensional target vectors. For this 2-class classification problem I wish to prove (non-) linear separability solely on the basis of the given training set (hence determine if the underlying problem may be solved with a 2-layer feedforward network). My approach so far: Denote input vectors of the first (second) class with H1 (H2). We need to find a hyperplane in n-dimesional space separating vectors of H1 and H2. The hyperplane goes through the origin of the n-dimensional hypercube. Negate the vectors of H2 and the problem now states: Find the hyperplane with all vectors (from H1 and the negated of H2) on one side of the plane and none on the other. Pick one of the vectors (= a vertex of the hypercube) and compute the Hamming Distance to all other vectors. Clearly, there must be a constraint on how many of the vectors are allowed to be how far "away" in order for the hyperplane to be able to separate. E.g.: If any of the Hamming Distances would be n, then the training set would not be linearly separable. My problem is concerning the constraints for Distances of n-1, n-2, etc... Has anyone taken a similar approach? Are there alternate solutions to linear separability? I tried linear programming, but the simplex algorithm uses a prohibitive number of variables for a mid-size problem. Your help is greatly appreciated, Christoph.  From alexis at CS.UCLA.EDU Sun Mar 8 23:45:38 1992 From: alexis at CS.UCLA.EDU (Alexis Wieland) Date: Sun, 8 Mar 92 20:45:38 -0800 Subject: linear separability In-Reply-To: christoph bruno herwig's message of Sun, 8 Mar 92 03:19:24 EST <9203080819.AA27836@eng.clemson.edu> Message-ID: <9203090445.AA06949@oahu.cs.ucla.edu> > From: christoph bruno herwig > Given a binary (-1/1) valued training set consisting of n-dimensional > input vectors (homogeneous coordinates: n-1 inputs and a 1 for the bias > term as the n-th dimension) and 1-dimensional target vectors. For this 2-class > classification problem I wish to prove (non-) linear separability > solely on the basis of the given training set (hence determine if > the underlying problem may be solved with a 2-layer feedforward network). > My approach so far: > Denote input vectors of the first (second) class with H1 (H2). We need > to find a hyperplane in n-dimesional space separating vectors of H1 > and H2. The hyperplane goes through the origin of the n-dimensional > hypercube. ... 
Your analysis relies on the dividing hyperplane passing through the origin, a condition that you dutifully state. But this need not be the case for linearly separable problems. Consider the simple 2D case with the four points (1,1), (1,-1), (-1,1), (-1,-1). Place one point in H1 and the rest in H2. The problem is clearly linearly separable, but there is no line that passes through the origin that will serve. The min distance from the origin to the dividing hyperplane for a class with one element increases with dimension of the input. - alexis. (Mr. Spiral :-)
From dlovell at s1.elec.uq.oz.au Mon Mar 9 18:52:48 1992 From: dlovell at s1.elec.uq.oz.au (David Lovell) Date: Mon, 9 Mar 92 18:52:48 EST Subject: linear separability Message-ID: <9203090853.AA20175@c10.elec.uq.oz.au> Sorry this is going out to everyone, I couldn't find a usable email address for Christoph. # #Given a binary (-1/1) valued training set consisting of n-dimensional #input vectors (homogeneous coordinates: n-1 inputs and a 1 for the bias #term as the n-th dimension) and 1-dimensional target vectors. For this 2-class #classification problem I wish to prove (non-) linear separability #solely on the basis of the given training set (hence determine if #the underlying problem may be solved with a 2-layer feedforward network). Actually, you only need one neuron to separate that set of vectors (if, of course, they are linearly separable). # #My approach so far: #Denote input vectors of the first (second) class with H1 (H2). We need #to find a hyperplane in n-dimesional space separating vectors of H1 #and H2. The hyperplane goes through the origin of the n-dimensional #hypercube. Negate the vectors of H2 and the problem now states: Find the #hyperplane with all vectors (from H1 and the negated of H2) on one side #of the plane and none on the other. Pick one of the vectors (= a vertex of #the hypercube) and compute the Hamming Distance to all other vectors. #Clearly, there must be a constraint on how many of the vectors are #allowed to be how far "away" in order for the hyperplane to be able to #separate. E.g.: If any of the Hamming Distances would be n, then the #training set would not be linearly separable. #My problem is concerning the constraints for Distances of n-1, n-2, #etc... # #Has anyone taken a similar approach? A distressingly large number of people in the 60's, actually. (I've been beating my head against that problem for the past 6 months.) #Are there alternate solutions to #linear separability? I tried linear programming, but the simplex #algorithm uses a prohibitive number of variables for a mid-size #problem. # There is a fundamental result concerning the sums of all TRUE vectors and all FALSE vectors, but it doesn't work with only a sampling of those vectors. Here is a list of papers that deal with the problem as best they can: W.H. Highleyman, "A note on linear separation", IRE Trans. Electronic Computers, EC-10, 777-778, 1961. R.C. Singleton, "A test for linear separability applied to self-organizing machines," in Self-Organizing Systems, M.C. Yovitts, Ed., 503-524, 1961. The Perceptron Convergence Procedure, Ho-Kashyap procedure and TLU synthesis techniques by Dertouzos or Kaszerman will not converge in the non-LS case, see: Perceptron Convergence Procedure R.O. Duda & P.E. Hart, "Pattern Classification and Scene Analysis," New York, John Wiley and Sons, 1973. Ho-Kashyap ibid TLU Dertouzos M.L. Dertouzos, "Threshold Logic: A Synthesis Approach" (MIT Research Monographs, vol. 32),
Cambridge, MA, MIT Press, 1965. TLU Kaszerman P. Kaszerman, "A geometric Test Synthesis Procedure for a threshold Device", Information and Control, 6, 381-393, 1963. I'm not sure, but I think those last two require the functions to be completely specified. The approach that I was looking at is as follows. What we are really interested in is the set of between n and n Choose Floor(n/2) vectors that closest to the separating hyperplane (if there is one). This is the union of the minimally TRUE and maximally FALSE vectors. If we can show that this set of vectors lies on one side of some hyperplane then the problem is solved. The reason that the problem is not solved is that I haven't yet been able to find a method for determining these vectors without having the problem completely specified! #Your help is greatly appreciated, # I could use a bit of help too. Anyone interested? Please mail me if you want or can supply more info. Don't forget to send your email address so that we can continue our discussion in private if it does not prove to be of general interest to the readership of connectionists. Happy connectionisming. -- David Lovell - dlovell at s1.elec.uq.oz.au | | Dept. Electrical Engineering | "Oh bother! The pudding is ruined University of Queensland | completely now!" said Marjory, as BRISBANE 4072 | Henry the daschund leapt up and Australia | into the lemon surprise. | tel: (07) 365 3564 |  From sankar at mbeya.research.att.com Mon Mar 9 09:34:29 1992 From: sankar at mbeya.research.att.com (ananth sankar) Date: Mon, 9 Mar 92 09:34:29 EST Subject: linear separability Message-ID: <9203091434.AA28244@klee.research.att.com> Alexis says with regard to Christoph's message... >Your analysis relies on the dividing hyperplane passing through the >origin, a condition that you dutifully state. But this need not be >the case for linearly separable problems. Consider the simple 2D case >with the four points (1,1), (1,-1), (-1,1), (-1,-1). Place one point >in H1 and the rest in H2. The problem is clearly linearly separable, >but there is no line that passes through the origin that will serve. Christoph also had stated that his n-dimensional input vector consisted of n-1 inputs and a constant input of 1. Thus the search for a solution for "a" and "b" so that a.x > b for all x is now transformed to solving a.x - b > 0 for all x, where (a,b) is a hyperplane passing thru the origin. Your example above has a hyperplane thru origin in 3-space solution if you add an additional input of 1 to each vector. --Ananth  From UBTY003 at cu.bbk.ac.uk Mon Mar 9 07:12:00 1992 From: UBTY003 at cu.bbk.ac.uk (Martin Davies) Date: Mon, 9 Mar 92 12:12 GMT Subject: European Society for Philosophy and Psychology Message-ID: ****** EUROPEAN SOCIETY FOR PHILOSOPHY AND PSYCHOLOGY ****** *********** INAUGURAL CONFERENCE *********** **** 17 - 19 JULY, 1992 **** The Inaugural Conference of the European Society for Philosophy and Psychology will take place in Louvain (Leuven) Belgium, from Friday 17 to Sunday 19 July, 1992. The goal of the Society is 'to promote interaction between philosophers and psychologists on issues of common concern'. The programme for this inaugural meeting will comprise invited lectures - by Dan Sperber and Larry Weiskrantz - and invited symposia. Topics for symposia include: Intentionality, Reasoning, Connectionist Models, Consciousness, Theory of Mind, and Philosophical Issues from Linguistics. There will also be a business meeting to inaugurate the Society formally. 
The conference will be held in the Institute of Philosophy, University of Louvain. The first session will commence at 3.00 pm on Friday 17 July, and the conference will end at lunchtime on Sunday 19 July. Accommodation at various prices in hotels and student residences will be available. To receive further information about registration and accommodation, along with programme details, please contact one of the following: Daniel Andler CREA 1 rue Descartes 75005 Paris France Email: azra at poly.polytechnique.fr Martin Davies Philosophy Department Birkbeck College Malet Street London WC1E 7HX England Email: ubty003 at cu.bbk.ac.uk Beatrice de Gelder Psychology Department Tilburg University P.O. Box 90153 5000 LE Tilburg Netherlands Email: beadegelder at kub.nl Tony Marcel MRC Applied Psychology Unit 15 Chaucer Road Cambridge CB2 2EF England Email: tonym at mrc-apu.cam.ac.uk ****************************************************************

From geiger at medusa.siemens.com Mon Mar 9 12:26:32 1992 From: geiger at medusa.siemens.com (Davi Geiger) Date: Mon, 9 Mar 92 12:26:32 EST Subject: Call for Workshops Message-ID: <9203091726.AA06684@medusa.siemens.com>

CALL FOR WORKSHOPS NIPS*92 Post-Conference Workshops December 4 and 5, 1992 Vail, Colorado Request for Proposals

Following the regular NIPS program, workshops on current topics in Neural Information Processing will be held on December 4 and 5, 1992, in Vail, Colorado. Proposals by qualified individuals interested in chairing one of these workshops are solicited. Past topics have included: Computational Neuroscience; Sensory Biophysics; Recurrent Nets; Self-Organization; Speech; Vision; Rules and Connectionist Models; Neural Network Dynamics; Computational Complexity Issues; Benchmarking Neural Network Applications; Architectural Issues; Fast Training Techniques; Active Learning and Control; Optimization; Bayesian Analysis; Genetic Algorithms; VLSI and Optical Implementations; Integration of Neural Networks with Conventional Software.

The goal of the workshops is to provide an informal forum for researchers to freely discuss important issues of current interest. Sessions will meet in the morning and in the afternoon of both days, with free time in between for ongoing individual exchange or outdoor activities. Specific open and/or controversial issues are encouraged and preferred as workshop topics. Individuals proposing to chair a workshop will have responsibilities including: arrange brief informal presentations by experts working on the topic, moderate or lead the discussion, and report its high points, findings and conclusions to the group during evening plenary sessions, and in a short (2 page) written summary.

Submission Procedure: Interested parties should submit a short proposal for a workshop of interest postmarked by May 22, 1992. (Express mail is *not* necessary. Submissions by electronic mail will also be acceptable.) Proposals should include a title, a short description of what the workshop is to address and accomplish, and the proposed length of the workshop (one day or two days). It should state why the topic is of interest or controversial, why it should be discussed and what the targeted group of participants is. In addition, please send a brief resume of the prospective workshop chair, a list of publications and evidence of scholarship in the field of interest.

Mail submissions to: Dr. Gerald Tesauro NIPS*92 Workshops Chair IBM T. J. Watson Research Center P.O.
Box 704 Yorktown Heights, NY 10598 USA (e-mail: tesauro at watson.ibm.com) Name, mailing address, phone number, and e-mail net address (if applicable) must be on all submissions. PROPOSALS MUST BE POSTMARKED BY MAY 22, 1992 Please Post  From giles at research.nj.nec.com Tue Mar 10 13:21:49 1992 From: giles at research.nj.nec.com (Lee Giles) Date: Tue, 10 Mar 92 13:21:49 EST Subject: Ability to generalise over multiple training runs Message-ID: <9203101821.AA08904@fuzzy.nj.nec.com> Regarding recent discussions on training different nets and their ability to get the same solution: We observed (Giles, et.al., in IJCNN91, NIPS4, & Neural Computation 92) similar results for recurrent nets learning small regular grammars (finite state automata) from positive and negative sample strings. Briefly, the characters of each string are presented at each time step and supervised training occurs at the end of string presentation (RTRL). [See the above papers for more information] Using random initial weight conditions and different numbers of neurons, most trained neural networks perfectly classified the training sets. Using a heuristic extraction method (there are many similar methods), a grammar could be extracted from the trained neural network. These extracted grammars were all different, but could be reduced to a unique "minimal number of states" grammar (or minimal finite state automaton). Though these experiments were for 2nd order fully recurrent nets, we've extracted the same grammars from 1st order recurrent nets using the same training data. Not all machines performed as well on unseen strings. Some were perfect on all strings tested; others weren't. For small grammars, nearly all of the trained neural networks produced perfect extracted grammars. In most cases the nets were trained on 10**3 strings and tested on randomly chosen 10**6 strings whose string length is < 99. (Since an arbitrary number of strings can be generated by these grammars, perfect generalization is not possible to test in practice.) In fact it was possible to extract ideal grammars from the trained nets that classified fairly well, but not perfectly, on the test set. [In other words, you could throw away the net and use just the grammar.} This agrees with Paul Atkins' comment: >From the above I presume (possibly incorrectly) that, if there are many >possible solutions, then some of them will work well for new inputs and >others will not work well. and with Manoel Fernando Tenorio's observation: >...then I contend that there are a very large, possibly infinite networks >architectures, or if a single architecture is chosen; if it is a >classification or interpolation; and if the weights are allowed to be real >valued or not. A simple modification on the input variable order, or the >presentation order, or the functions of the nodes, or the initial points, >or the number of hidden nodes would lead to different nets... C. 
Lee Giles NEC Research Institute 4 Independence Way Princeton, NJ 08540 USA Internet: giles at research.nj.nec.com UUCP: princeton!nec!giles PHONE: (609) 951-2642 FAX: (609) 951-2482

From rsun at orion.ssdc.honeywell.com Tue Mar 10 16:03:05 1992 From: rsun at orion.ssdc.honeywell.com (Ron Sun) Date: Tue, 10 Mar 92 15:03:05 CST Subject: No subject Message-ID: <9203102103.AA17546@orion.ssdc.honeywell.com>

Paper announcement:
------------------------------------------------------------------

Beyond Associative Memories: Logics and Variables in Connectionist Models

Ron Sun Honeywell SSDC 3660 Technology Drive Minneapolis, MN 55418

abstract

This paper demonstrates the role of connectionist (neural network) models in reasoning beyond that of an associative memory. First we show that there is a connection between propositional logics and the weighted-sum computation customarily used in connectionist models. Specifically, the weighted-sum computation can handle Horn clause logic and Shoham's logic as special cases. Secondly, we show how variables can be incorporated into connectionist models to enhance their representational power. We devise solutions to the connectionist variable binding problem to enable connectionist networks to handle variables and dynamic bindings in reasoning. A new model, the Discrete Neuron formalism, is employed for dealing with the variable binding problem, which is an extension of the weighted-sum models. Formal definitions are presented, and examples are analyzed in detail.

To appear in: Information Sciences, special issues on neural nets and AI

It is FTPable from archive.cis.ohio-state.edu in: pub/neuroprose No hardcopy available.

FTP procedure: unix> ftp archive.cis.ohio-state.edu (or 128.146.8.52) Name: anonymous Password: neuron ftp> cd pub/neuroprose ftp> binary ftp> get sun.beyond.ps.Z ftp> quit unix> uncompress sun.beyond.ps.Z unix> lpr sun.beyond.ps (or however you print postscript)

From dlovell at s1.elec.uq.oz.au Thu Mar 12 10:48:29 1992 From: dlovell at s1.elec.uq.oz.au (David Lovell) Date: Thu, 12 Mar 92 10:48:29 EST Subject: Paper on Neocognitron Training avail on neuroprose Message-ID: <9203120048.AA02305@c10.elec.uq.oz.au>

The following comment (3 pages in length) has been placed in the Neuroprose archive and submitted to IEEE Transactions on Neural Networks. Any comments or questions (both of which are invited) should be addressed to the first author: dlovell at s1.elec.uq.oz.au Thanks must go to Jordan Pollack for maintaining this excellent service. Apologies if this is the second copy of this notice to reach your site but there were problems in mailing the original (i.e. I don't think it got through).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

A NOTE ON A CLOSED-FORM TRAINING ALGORITHM FOR THE NEOCOGNITRON

David Lovell, Ah Chung Tsoi & Tom Downs Intelligent Machines Laboratory, Department of Electrical Engineering University of Queensland, Queensland 4072, Australia

In this note, a difficulty with the application of Hildebrandt's closed-form training algorithm for the neocognitron is reported. In applying this algorithm we have observed that S-cells frequently fail to respond to features that they have been trained to extract. We present results which indicate that this training vector rejection is an important factor in the overall classification performance of the neocognitron trained using Hildebrandt's procedure.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% filename: lovell.closed-form.ps.Z FTP INSTRUCTIONS unix% ftp archive.cis.ohio-state.edu (or 128.146.8.52) Name: anonymous Password: anything ftp> cd pub/neuroprose ftp> binary ftp> get lovell.closed-form.ps.Z ftp> bye unix% zcat lovell.closed-form.ps.Z | lpr (or whatever *you* do to print a compressed PostScript file) ---------------------------------------------------------------- David Lovell - dlovell at s1.elec.uq.oz.au | | Dept. Electrical Engineering | University of Queensland | BRISBANE 4072 | Australia | | tel: (07) 365 3564 |  From smieja at jargon.gmd.de Wed Mar 11 12:13:57 1992 From: smieja at jargon.gmd.de (Frank Smieja) Date: Wed, 11 Mar 92 18:13:57 +0100 Subject: TR--Reflective Neural Network Architecture Message-ID: <9203111713.AA18184@jargon.gmd.de> The following paper has been placed in the Neuroprose archive. ******************************************************************* REFLECTIVE MODULAR NEURAL NETWORK SYSTEMS F. J. Smieja and H. Muehlenbein German National Research Centre for Computer Science (GMD) Schlo{\ss} Birlinghoven, 5205 St. Augustin 1, Germany. ABSTRACT Many of the current artificial neural network systems have serious limitations, concerning accessibility, flexibility, scaling and reliability. In order to go some way to removing these we suggest a {\it reflective neural network architecture}. In such an architecture, the modular structure is the most important element. The building-block elements are called ``\MINOS'' modules. They perform {\it self-observation\/} and inform on the current level of development, or scope of expertise, within the module. A {\it Pandemonium\/} system integrates such submodules so that they work together to handle mapping tasks. Network complexity limitations are attacked in this way with the Pandemonium problem decomposition paradigm, and both static and dynamic unreliability of the whole Pandemonium system is effectively eliminated through the generation and interpretation of {\it confidence\/} and {\it ambiguity\/} measures at every moment during the development of the system. Two problem domains are used to test and demonstrate various aspects of our architecture. {\it Reliability\/} and {\it quality\/} measures are defined for systems that only answer part of the time. Our system achieves better quality values than single networks of larger size for a handwritten digit problem. When both second and third best answers are accepted, our system is left with only 5\% error on the test set, 2.1\% better than the best single net. It is also shown how the system can elegantly learn to handle garbage patterns. With the parity problem it is demonstrated how complexity of problems may be decomposed automatically by the system, through solving it with networks of size smaller than a single net is required to be. Even when the system does not find a solution to the parity problem, because networks of too small a size are used, the reliability remains around 99--100\%. Our Pandemonium architecture gives more power and flexibility to the higher levels of a large hybrid system than a single net system can, offering useful information for higher-level feedback loops, through which reliability of answers may be intelligently traded for less reliable but important ``intuitional'' answers. In providing weighted alternatives and possible generalizations, this architecture gives the best possible service to the larger system of which it will form part. 
Keywords: Reflective architecture, Pandemonium, task decomposition, confidence, reliability. ******************************************************************** ---------------------------------------------------------------- FTP INSTRUCTIONS unix% ftp archive.cis.ohio-state.edu (or 128.146.8.52) Name: anonymous Password: neuron ftp> cd pub/neuroprose ftp> binary ftp> get smieja.reflect.ps.Z ftp> bye unix% zcat smieja.reflect.ps.Z | lpr (or whatever *you* do to print a compressed PostScript file) ---------------------------------------------------------------- -Frank Smieja  From holm at nordita.dk Thu Mar 12 05:47:29 1992 From: holm at nordita.dk (Holm Schwarze) Date: Thu, 12 Mar 92 11:47:29 +0100 Subject: No subject Message-ID: <9203121047.AA07386@norsci0.nordita.dk> ** DO NOT FORWARD TO OTHER GROUPS ** The following paper has been placed in the Neuroprose archive in file schwarze.gentree.ps.Z . Retrieval instructions follow the abstract. Hardcopies are not available. -- Holm Schwarze (holm at nordita.dk) ------------------------------------------------------------------------- GENERALIZATION IN A LARGE COMMITTEE MACHINE H. Schwarze and J. Hertz CONNECT, The Niels Bohr Institute and Nordita Blegdamsvej 17, DK-2100 Copenhagen, Denmark ABSTRACT We study generalization in a committee machine with non--overlapping receptive fields trained to implement a function of the same structure. Using the replica method, we calculate the generalization error in the limit of a large number of hidden units. For continuous weights the generalization error falls off asymptotically inversely proportional to alpha, the number of training examples per weight. For binary weights we find a discontinuous transition from poor to perfect generalization followed by a wide region of metastability. Broken replica symmetry is found within this region at low temperatures. The first--order transition occurs at a lower and the metastability limit at a higher value of alpha than in the simple perceptron. ------------------------------------------------------------------------- To retrieve the paper by anonymous ftp: unix> ftp archive.cis.ohio-state.edu # (128.146.8.52) Name: anonymous Password: neuron ftp> cd pub/neuroprose ftp> binary ftp> get schwarze.gentree.ps.Z ftp> quit unix> uncompress schwarze.gentree.ps.Z unix> lpr -P schwarze.gentree.ps ------------------------------------------------------------------------- -------  From MORCIH92%IRLEARN.UCD.IE at BITNET.CC.CMU.EDU Thu Mar 12 09:11:29 1992 From: MORCIH92%IRLEARN.UCD.IE at BITNET.CC.CMU.EDU (Michal Morciniec) Date: Thu, 12 Mar 92 14:11:29 GMT Subject: Internal representations Message-ID: <01GHJVGN5RV4D3YTH2@BITNET.CC.CMU.EDU> I am interested in the internal representations developing in hidden units as a result of training with BP algorithm. I have done some experiments with simple ( one hidden layer with 4 nodes ) network trained to recognise hand-written digits. I found that feature detectors developed during training ( with 500 patterns ) are quite 'similar' to those reported in [1]. Interesting question arises: IF training with different databases of patterns of certain class (in this case hand-printed digits) creates similar internal representations in hidden nodes of networks with similar architecture THEN is it possible to predict what features are likely to be developed as result of training with particular patterns ? Does anybody have a knowledge of research in this area ( i.e WHY such and not other features are created ) ? 1. 
Martin G.L., Pittman J.A, "Recognizing Hand-Printed Letters and Digits" Thanks for comments, ===================================================================== !! Michal Morciniec !! !! !! 3 Sweetmount Pk., !! !! !! Dundrum, Dublin 14, !! !! !! !! MORCIH92 at IRLEARN.UCD.IE !! !! Eire !! !! =====================================================================  From MURRE at rulfsw.LeidenUniv.nl Thu Mar 12 16:09:00 1992 From: MURRE at rulfsw.LeidenUniv.nl (MURRE@rulfsw.LeidenUniv.nl) Date: Thu, 12 Mar 1992 16:09 MET Subject: 68 neurosimulators Message-ID: <01GHK9WIA4408WW4MH@rulfsw.LeidenUniv.nl> We have now updated and extended our table with neurosimulators to include 68 neurosimulators. We present the table below. (Sorry, for the many bytes taken by this format. We expect that this format is easier to handle by everyone.) Work on the review paper, unfortunately, has been interrupted by several events. We plan to have something available within the next few months. In this paper we will ponder on the possibility of deriving some standards for a number of the 'most popular' neural networks. If we could agree on such a set, it would be much easier to directly exchange models and simulation scripts (at least, for this limited set of neural network paradigms). Has anyone ever worked on this? If anyone wants to point out errors, fill in some blanks, or prosose to add (or remove) a system from the list, please, follow the format of the table. Additional comments (i.e., extra references to be included in the general review paper, background information, or reasons why a certain entry is wrong) may then follow the changed lines. Example: The following line in the table ought to be changed to: Name Manufacturer Hardware METANET Leiden University IBM, MAC Within 6 months from now, a MAC version will be available for this system. Adherence to this format will make it much easier for us to deal with the comments. Jacob M.J. Murre Steven E. Kleyenmberg Jacob M.J. Murre Unit of Experimental and Theoretical Psychology Leiden University P.O. Box 9555 2300 RB Leiden The Netherlands E-mail: Murre at HLERUL55.Bitnet tel.: 31-71-273631 fax.: 31-71-273619 N.B. At April 1 1992, I will start working at the following address: Jacob M.J. Murre Medical Research Council: Applied Psychology Unit 15 Chaucer Road Cambridge CB2 2EF England E-mail: jaap.murre at mrc-apu.cam.ac.uk tel.: 44-223-355294 (ext.139) fax.: 44-223-359062 Table 1.a. Neurosimulators. Name Manufacturer Hardware -------------------------------------------------------------------------------- ADAPTICS Adaptic ANNE Oregon Graduate Center Intel iPSC hypercube ANSE TRW TWR neurocom. mark 3,4,5 ANSIM SAIC IBM ANSKIT SAIC ANSPEC SIAC IBM,MAC,SUN,VAX,SIGMA/DELTA AWARENESS Neural Systems IBM AXON HNC Inc. HNC Neurocom. ANZA,ANZA+ BOSS BPS George Mason Univ., Fairfax IBM,VAX,SUN BRAIN SIMULATOR Abbot,Foster & Hauserman IBM BRAINMAKER California Scientific Software IBM CABLE Duke University VAX CASCOR CASENET COGNITRON Cognitive Software MAC,IBM CONE IBM Palo Alto IBM CONNECTIONS IBM COPS Case Western Reserve Univ. CORTEX DESIRE/NEUNET IBM EXPLORENET 3000 HNC Inc. IBM,VAX GENESIS Neural Systems IBM GENESIS/XODUS VAX,SUN GRADSIM VAX GRIFFIN Texas Instruments/Cambridge TI NETSIM neurocomputer HYPERBRAIN Neurix Inc. MAC MACBRAIN Neurix Inc. MAC MACTIVATION University of Colorado MAC METANET Leiden University IBM,(VAX) MIRRORS/II University of Maryland VAX,SUN N-NET AIWare Inc. IBM,VAX N1000 Nestor Inc IBM,SUN N500 Nestor Inc. 
IBM NCS North Carolina State Univ. (portable) NEMOSYS IBM RS/6000 NESTOR Nestor Inc. IBM,MAC NET NETSET 2 HNC Inc. IBM,SUN,VAX NETWURKZ Dair Computer Systems IBM NEURALSHELL Ohio State University SUN NEURALWORKS NeuralWare Inc. IBM,MAC,SUN,NEXT,INMOS NEURDS Digtal Equipment Corporation VAX NEUROCLUSTERS VAX NEURON Duke University NEUROSHELL Ward Systems Group IBM NEUROSOFT HNC Inc. NEUROSYM NeuroSym Corp. IBM NEURUN Dare research IBM NN3/SESAME GMD, Sankt Augustin, BDR SUN NNSIM OPT OWL Olmsted & Watkins IBM,MAC,SUN,VAX P3 U.C.S.D. Symbolics PABLO PDP McClelland & Rumelhart IBM,MAC PLANET University of Colorado SUN,APOLLO,ALLIANT PLATO/ARISTOTLE NeuralTech PLEXI Symbolics Inc/Lucid Inc Symbolics,SUN POPLOG-NEURAL University of Sussex SUN,VAX PREENS Nijmegen University SUN PYGMALION Esprit SUN,VAX RCS Rochester University SUN,MAC SAVY TEXT RETR. SYS. Excalibur Technologies IBM,VAX SFINX U.C.L.A. SLONN Univ. of Southern California SNNS Stuttgart University SUN,DEC,HP,IBM SUNNET SUN Table 1.b. Neurosimulators. Name Language Models Price $ -------------------------------------------------------------------------------- ADAPTICS ANNE HLL/ILL/NDL ANSE ANSIM many 495 ANSKIT ANSPEC HLL many 995 AWARENESS 275 AXON HLL 1950 BOSS BPS C bp 100 BRAIN SIMULATOR 99 BRAINMAKER Macro bp 195 CABLE HLL CASCOR CASENET Prolog COGNITRON HLL (Lisp) many 600 CONE HLL CONNECTIONS hopf 87 COPS CORTEX DESIRE/NEUNET matrix EXPLORENET 3000 GENESIS 1095 GENESIS/XODUS C GRADSIM C GRIFFIN HYPERBRAIN 995 MACBRAIN many 995 MACTIVATION METANET HLL (C) many 1000 MIRRORS/II HLL (Lisp) several N-NET C bp 695 N1000 19000 N500 NCS HLL (C++) many NEMOSYS NESTOR 9950 NET NETSET 2 many 19500 NETWURKZ 80 NEURALSHELL C many NEURALWORKS C 1495 NEURDS C NEUROCLUSTERS NEURON HLL NEUROSHELL bp 195 NEUROSOFT NEUROSYM many 179 NEURUN bp NN3/SESAME many NNSIM OPT C OWL many 1495 P3 HLL many PABLO PDP several 44 PLANET HLL many PLATO/ARISTOTLE PLEXI Lisp,C,Pascal many POPLOG-NEURAL HLL,POP-11 bp,cl PREENS HLL many PYGMALION HLL (parallel C) many RCS C SAVY TEXT RETR. SYS. C SFINX HLL SLONN SNNS HLL many SUNNET Table 1.c. Neurosimulators. Name Comments -------------------------------------------------------------------------------- ADAPTICS training software for neural-networks ANNE neural-network development environment ANSE ANSIM ANSKIT development tool for large artificial neural-networks ANSPEC AWARENESS introductory NN program AXON neural-network description language BOSS BPS BRAIN SIMULATOR BRAINMAKER neural-networks simulation software CABLE CASCOR cascade-correlation simulator CASENET graphical case-tool for generating executable code COGNITRON neural-network,prototyping,delivery system CONE research environment CONNECTIONS COPS combinatorial optimization problems CORTEX neural-network graphics tool DESIRE/NEUNET interactive neural-networks experiment environment EXPLORENET 3000 stand-alone neural-network software GENESIS neural-network development system GENESIS/XODUS general neural simulator, X-wnd. 
output, simulation utilities GRADSIM GRIFFIN research environment for TI NETSIM neurocomputer HYPERBRAIN MACBRAIN MACTIVATION introductory neural-network simulator METANET general neurosimulator, CAD for NN architectures MIRRORS/II neurosimulator for parallel environments N-NET integrated neural-network development system N1000 N500 NCS NEMOSYS simulation software NESTOR NET NETSET 2 NETWURKZ training tool for IBM pc NEURALSHELL NEURALWORKS neural-networks development system NEURDS NEUROCLUSTERS simulation tool for biological neural networks NEURON NEUROSHELL NEUROSOFT NEUROSYM NEURUN interactive neural-network environment NN3/SESAME neurosimulator for modular neural networks NNSIM mixed neural/digital image processing system OPT all-purpose simulator OWL P3 early PDP development system PABLO PDP introductory simulator, complements 'the PDP volumes' PLANET PLATO/ARISTOTLE knowledge processor for expert systems PLEXI flexible neurosimulator with graphical interaction POPLOG-NEURAL PREENS workbench for NN constr., visualisation,man., and simul. PYGMALION general, parallel neurosimulator under X-Windows RCS research environment, graphical neurosimulator SAVY TEXT RETRIEVAL SYSTEM SFINX research environment SLONN SNNS SUNNET Table 1.d. Neurosimulators. Name Abbreviated reference -------------------------------------------------------------------------------- ADAPTICS ANNE ANSE ANSIM [Cohen, H., Neural Network Review, 3, 102-133, 1989] ANSKIT [Barga R.S, Proc. IJCNN-90-Washington DC, 2, 94-97, 1990] ANSPEC AWARENESS [BYTE, 14(8), 244-245, 1989] AXON [BYTE, 14(8), 244-245, 1989] BOSS [Reggia J.A., Simulation, 51, 5-19, 1988] BPS BRAIN SIMULATOR BRAINMAKER [BYTE, 14(8), 244-245, 1989] CABLE [Miller J.P., Nature, 347, 783-784, 1990] CASCOR CASENET [Dobbins R.W, Proc. IJCNN-90-Wash. DC, 2, 122-125, 1990] COGNITRON [BYTE, 14(8), 244-245, 1989] CONE CONNECTIONS [BYTE, 14(8), 244-245, 1989] COPS [Takefuji Y., Science, 245, 1221-1223, 1990] CORTEX [Reggia J.A., Simulation, 51, 5-19, 1988] DESIRE/NEUNET [Korn G.A, Neural Networks, 2, 229-237, 1989] EXPLORENET 3000 [BYTE, 14(8), 244-245, 1989] GENESIS [Miller J.P., Nature, 347, 783-784, 1990] GENESIS/XODUS GRADSIM GRIFFIN HYPERBRAIN [BYTE, 14(8), 244-245, 1989] MACBRAIN [BYTE, 14(8), 244-245, 1989] MACTIVATION METANET [Murre J.M.J., Proc. ICANN-91-FIN, 1, 545-550, 1991] MIRRORS/II [Reggia, J.A., Simulation, 51, 5-19, 1988] N-NET [BYTE, 14(8), 244-245, 1989] N1000 [BYTE, 14(8), 244-245, 1989] N500 [BYTE, 14(8), 244-245, 1989] NCS NEMOSYS [Miller J.P., Nature, 347, 783-784, 1990] NESTOR NET [Reggia J.A., Simulation, 51, 5-19, 1988] NETSET 2 NETWURKZ [BYTE, 14(8), 244-245, 1989] NEURALSHELL NEURALWORKS [BYTE, 14(8), 244-245, 1989] NEURDS NEUROCLUSTERS NEURON [Miller J.P., Nature, 347, 783-784, 1990] NEUROSHELL [BYTE, 14(8), 244-245, 1989] NEUROSOFT NEUROSYM NEURUN NN3/SESAME NNSIM [Nijhuis J.L., Microproc. & Microprogr., 27,189-94, 1989] OPT OWL [BYTE, 14(8), 244-245, 1989] P3 [In: 'PDP Volume 1', MIT Press, 488-501, 1986] PABLO PDP [Rumelhart et al. 'Explorations in PDP', MIT Press, 1988] PLANET PLATO/ARISTOTLE PLEXI POPLOG-NEURAL PREENS PYGMALION RCS SAVY TEXT RETR. SYS. [BYTE, 14(8), 244-245, 1989] SFINX [Mesrobian E., IEEE Int. Conf. on Man, Sys. 
& Cyb., 1990] SLONN [Simulation, 55, 69-93, 1990] SNNS SUNNET

Explanation of abbreviations and terms: Manufacturer: company, institute, or researchers associated with the system Languages: HLL = High Level Language (i.e., network definition language; if specific programming languages are mentioned, networks can be defined using high-level functions in these languages) Models: several = a fixed number of models is (and will be) supported many = the systems can be (or will be) extended with new models bp = backpropagation (if specific models are mentioned, these are the only ones supported by the system) hopf = hopfield cl = competitive learning Price: indication of price range in US dollars (if no price is given, this can either mean that the price is unknown to us, that the system is not available (yet) for general distribution, or that the system is available at a nominal charge) Comment: attempt to indicate the primary function of the system Reference: a single reference that contains pointers to the manufacturers, who may be contacted for further information (a more complete list of references, also containing review articles, etc., will appear in a general review paper by us - this paper is still in preparation and not yet available for preliminary distribution [sorry])

From rsun at orion.ssdc.honeywell.com Thu Mar 12 15:58:57 1992 From: rsun at orion.ssdc.honeywell.com (Ron Sun) Date: Thu, 12 Mar 92 14:58:57 CST Subject: No subject Message-ID: <9203122058.AA01720@orion.ssdc.honeywell.com>

TR available: (comments and suggestions are welcome)
------------------------------------------------------------------

An Efficient Feature-based Connectionist Inheritance Scheme

Ron Sun Honeywell SSDC 3660 Technology Drive Minneapolis, MN 55418

The paper describes how a connectionist architecture deals with the inheritance problem in an efficient and natural way. Based on the connectionist architecture CONSYDERR, we analyze the problem of property inheritance and formulate it in ways facilitating conceptual clarity and implementation. A set of ``benchmarks'' is specified for ensuring the correctness of inheritance mechanisms. Parameters of CONSYDERR are formally derived to satisfy these benchmark requirements. We discuss how chaining of is-a links and multiple inheritance can be handled in this architecture. This paper shows that CONSYDERR with a two-level dual (localist and distributed) representation can handle inheritance and cancellation of inheritance correctly and extremely efficiently, in constant time instead of time proportional to the length of a chain in an inheritance hierarchy. It also demonstrates the utility of a meaning-oriented, intensional approach (with features) for supplementing and enhancing extensional approaches.

----------------------------------------------------------------

It is FTPable from archive.cis.ohio-state.edu in: pub/neuroprose (Courtesy of Jordan Pollack) No hardcopy available.
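A toy sketch of the general idea of feature-based ("intensional") inheritance - not the CONSYDERR model described above, just an invented illustration in Python; all concepts, features and facts below are made up. A property is judged from the explicit carrier whose feature set is most similar to the query concept, so an exception wins without walking an is-a chain:

# Toy feature-based inheritance (illustration only, not CONSYDERR).
def feature_similarity(a, b):
    # simple overlap measure between two feature sets
    return len(a & b) / float(len(a | b))

features = {
    "bird":    {"animal", "feathers", "wings"},
    "canary":  {"animal", "feathers", "wings", "small", "yellow"},
    "penguin": {"animal", "feathers", "wings", "aquatic", "flightless"},
}

# explicitly stored values of the property "flies", exceptions included
flies = {"bird": True, "penguin": False}

def inherit(concept, facts):
    # answer from the most similar explicit carrier: one pass over the
    # carriers, independent of the depth of any is-a hierarchy
    best = max(facts, key=lambda c: feature_similarity(features[concept], features[c]))
    return facts[best]

print("canary flies: ", inherit("canary", flies))    # True  (nearest carrier: bird)
print("penguin flies:", inherit("penguin", flies))   # False (its own exception wins)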
FTP procedure: unix> ftp archive.cis.ohio-state.edu (or 128.146.8.52) Name: anonymous Password: neuron ftp> cd pub/neuroprose ftp> binary ftp> get sun.inh.ps.Z ftp> quit unix> uncompress sun.inh.ps.Z unix> lpr sun.inh.ps (or however you print postscript)  From FEGROSS at weizmann.weizmann.ac.il Fri Mar 13 02:58:21 1992 From: FEGROSS at weizmann.weizmann.ac.il (Tal Grossman) Date: Fri, 13 Mar 92 09:58:21 +0200 Subject: linear separability Message-ID: The issue of verifying whether a given set of vectors is linearly separable or not was discussed time and again in this forum, and it is certainly very interesting for many of us. I will therefore remind here a few old refs. and add one (hopefully relevant) insight and one new (surely relevant) reference. The basic geometrical approach is that of the convex hulls of the two classes: if their intersection is non empty, then the two sets are not linearly separable (and vice versa). This method is really old (the Highleyman paper). Related methods can be found in Lewis and Coates, "Threshold Logic" (Wiley 1967). In terms of computational complexity these methods are not any better than linear programming. About the idea of using the distances among the vectors: I suggest the following simple experiment - For large N (say 100), create P random, N dimensional binary vectors, and then make a histogram of the hamming distance between all pairs. Compare such histograms for P<2N (in this case the set is almost always linearly separable) and for P>2N (and then it is almost always non l.s.). You will find no difference. Which shows that as it is presented, your approach can not help in verifying linear separability. The last item is a new perceptron type learning algorithm by Nabutovsky and Domany (Neural Computation 3 (1991) 604). It either finds a solution (namely, a separating vector) to a given set, or stops with a definite conclusion that the problem is non separable. and of course I would be glad to hear about any new idea - Tal Grossman (fegross at weizmann) Electronics Dept. Weizmann Inst. of Science Rehovot 76100 ISRAEL  From smieja at jargon.gmd.de Fri Mar 13 10:27:11 1992 From: smieja at jargon.gmd.de (Frank Smieja) Date: Fri, 13 Mar 92 16:27:11 +0100 Subject: TR (reflective) Message-ID: <9203131527.AA27751@jargon.gmd.de> -) ******************************************************************* -) REFLECTIVE MODULAR NEURAL NETWORK SYSTEMS -) -) F. J. Smieja and H. Muehlenbein -) -) German National Research Centre for Computer Science (GMD) -) Schlo{\ss} Birlinghoven, -) 5205 St. Augustin 1, -) Germany. -) -) ABSTRACT -) -) Many of the current artificial neural network systems have serious -) limitations, concerning accessibility, flexibility, scaling and -) reliability. In order to go some way to removing these we suggest a -) {\it reflective neural network architecture}. In such an architecture, -) the modular structure is the most important element. The -) building-block elements are called ``\MINOS'' modules. They perform -) {\it self-observation\/} and inform on the current level of -) development, or scope of expertise, within the module. A {\it -) Pandemonium\/} system integrates such submodules so that they work -) together to handle mapping tasks. 
Network complexity limitations are -) attacked in this way with the Pandemonium problem decomposition -) paradigm, and both static and dynamic unreliability of the whole -) Pandemonium system is effectively eliminated through the generation -) and interpretation of {\it confidence\/} and {\it ambiguity\/} -) measures at every moment during the development of the system. -) -) Two problem domains are used to test and demonstrate various aspects -) of our architecture. {\it Reliability\/} and {\it quality\/} measures -) are defined for systems that only answer part of the time. Our system -) achieves better quality values than single networks of larger size for -) a handwritten digit problem. When both second and third best answers -) are accepted, our system is left with only 5\% error on the test set, -) 2.1\% better than the best single net. It is also shown how the -) system can elegantly learn to handle garbage patterns. With the -) parity problem it is demonstrated how complexity of problems may be -) decomposed automatically by the system, through solving it with -) networks of size smaller than a single net is required to be. Even -) when the system does not find a solution to the parity problem, -) because networks of too small a size are used, the reliability remains -) around 99--100\%. -) -) Our Pandemonium architecture gives more power and flexibility to the -) higher levels of a large hybrid system than a single net system can, -) offering useful information for higher-level feedback loops, through -) which reliability of answers may be intelligently traded for less -) reliable but important ``intuitional'' answers. In providing weighted -) alternatives and possible generalizations, this architecture gives the -) best possible service to the larger system of which it will form part. -) -) Keywords: Reflective architecture, Pandemonium, task decomposition, -) confidence, reliability. -) ******************************************************************** -) -) -) ---------------------------------------------------------------- -) FTP INSTRUCTIONS -) -) unix% ftp archive.cis.ohio-state.edu (or 128.146.8.52) -) Name: anonymous -) Password: neuron -) ftp> cd pub/neuroprose -) ftp> binary -) ftp> get smieja.reflect.ps.Z -) ftp> bye -) unix% zcat smieja.reflect.ps.Z | lpr -) (or whatever *you* do to print a compressed PostScript file) -) ---------------------------------------------------------------- -) Apparently the original format was such that it was not possible to print out on American-sized paper. Therefore I have changed the format and re-inserted the file smieja.reflect.ps.Z into the neuroprose archive. It should be all on the sheet now. Instructions as before. -Frank Smieja  From LWCHAN at CUCSD.CUHK.HK Fri Mar 13 04:31:00 1992 From: LWCHAN at CUCSD.CUHK.HK (LAI-WAN CHAN) Date: Fri, 13 Mar 1992 17:31 +0800 Subject: Internal representations Message-ID: <7C58C87DE020033B@CUCSD.CUHK.HK> > I am interested in the internal representations developing in hidden > units as a result of training with BP algorithm. I have done some I have done experiments to find out the internal representations of the BP net [1]. I used some training sets and looked at their hidden nodes. The hidden nodes showed particular arrangement (e.g. residing on a circle) for some training patterns. I did not include any results on the digit recognition but I found some hidden nodes have been trained to be responsible for some feature detection. 
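A minimal sketch of the kind of inspection described in the two notes above (an illustration, not code from either study): reshape each hidden unit's incoming weight vector to the input image shape and print it as a crude feature-detector map. The 16x16 input size, the 4 hidden units and the random weight matrix standing in for trained input-to-hidden weights are all assumptions.

# Crude inspection of hidden-unit "feature detectors" (illustration only).
import numpy as np

rows, cols, n_hidden = 16, 16, 4                 # assumed image and net sizes
W1 = np.random.randn(n_hidden, rows * cols)      # placeholder for trained weights

def show_feature_detectors(W1, rows, cols):
    # '#' marks strongly positive weights, '.' strongly negative, ' ' near zero
    for h, w in enumerate(W1):
        img = w.reshape(rows, cols)
        lo, hi = np.percentile(img, 20), np.percentile(img, 80)
        print("hidden unit %d" % h)
        for r in img:
            print("".join("#" if v >= hi else "." if v <= lo else " " for v in r))
        print()

show_feature_detectors(W1, rows, cols)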
[1] Analysis of the Internal Representations in Neural Networks for Machine Intelligence, Lai-Wan CHAN, AAAI-91, Vol.2, p578-583, 1991.

Lai-Wan Chan, Computer Science Dept, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong. email : lwchan at cucsd.cuhk.hk tel : (+852) 609 8865 FAX : (+852) 603 5024

From kamil at apple.com Fri Mar 13 16:44:36 1992 From: kamil at apple.com (Kamil A. Grajski) Date: Fri, 13 Mar 92 13:44:36 -0800 Subject: Summer Internships at Apple Computer Message-ID: <9203132144.AA28397@apple.com>

Unofficial announcement

Summer Internships at Apple Computer in Cupertino, CA

The Speech & Language Technologies Department in Apple's Advanced Technology Group has summer internship positions available. The typical intern experience is to focus on one project under the close supervision of one or more senior researchers/engineers. In the past, intern projects in this group have resulted in lasting contributions - not just busy work! There is a formal review process, end of summer presentation, etc. There are corporate-wide summer intern social and professional events, too.

Qualifications can span several areas. Upper division undergraduate, or early graduate students preferred in the following areas: a.) general speech processing; b.) front-end signal processing; c.) statistical pattern recognition, e.g., HMMs, general methods; d.) speech synthesis; e.) natural language; and f.) Macintosh (MPW) programming. This is not an exhaustive list, but the main point is that the candidate should have a strong commitment to doing really great work in speech technology.

I apologize in advance that I will NOT be able to acknowledge each and every inquiry, individually. Kamil A. Grajski kamil at apple.com

From thildebr at aragorn.csee.lehigh.edu Sat Mar 14 13:39:33 1992 From: thildebr at aragorn.csee.lehigh.edu (Thomas H. Hildebrandt ) Date: Sat, 14 Mar 92 13:39:33 -0500 Subject: Internal representations In-Reply-To: LAI-WAN CHAN's message of Fri, 13 Mar 1992 17:31 +0800 <7C58C87DE020033B@CUCSD.CUHK.HK> Message-ID: <9203141839.AA13493@aragorn.csee.lehigh.edu>

See also: Bunpei Irie and Mitsuo Kawato, "Acquisition of internal representation by multi-layered perceptrons", Denshi Joohoo Tsuushin Gakkai Ronbunshi (Tr. of the Institute of Electronic Communication Engineers (of Japan)), V.J73-D-II, N.8, pp.1173-1178 (Aug 1990), in Japanese. My translation of this article will appear in Systems and Computers in Japan (Scripta Technica, Silver Spring, MD), but may be obtained directly (sans figures) by sending me an e-mail request. Thomas H. Hildebrandt Visiting Research Scientist CSEE Department Lehigh University

From FEDIMIT at weizmann.weizmann.ac.il Sun Mar 15 14:11:02 1992 From: FEDIMIT at weizmann.weizmann.ac.il (Dan Nabutovsky) Date: Sun, 15 Mar 92 21:11:02 +0200 Subject: Linear nonseparability Message-ID:

> From: christoph bruno herwig > I was wondering if someone could point me in the right direction > concerning the following fundamental separability problem: > Given a binary (-1/1) valued training set consisting of n-dimensional > input vectors (homogeneous coordinates: n-1 inputs and a 1 for the bias > term as the n-th dimension) and 1-dimensional target vectors. For this > 2-class classification problem I wish to prove (non-) linear separability > solely on the basis of the given training set (hence determine if > the underlying problem may be solved with a 2-layer feedforward network).

An algorithm that solves this problem is described in our paper: D. Nabutovsky & E.
Domany, "Learning the Unlearnable", Neural Computation 3(1991), 604. We present perceptron learning rule that finds separation plane when a set of patterns is linearly separable, and proves linear non-separability otherwise. Our approach is completely different from those described by Christoph. Our idea is to do perceptron learning, always keeping in mind constraint for the distance between current vector and solution. When this constraint becomes impossible, nonseparability is proved. Using sophisticated choice of learning step size, we ensure that algorithm always finds a solution or proves its absence in a finite number of steps. Dan Nabutovsky (FEDIMIT at WEIZMANN.WEIZMANN.AC.IL)  From bradley at ivy.Princeton.EDU Sun Mar 15 19:21:16 1992 From: bradley at ivy.Princeton.EDU (Bradley Dickinson) Date: Sun, 15 Mar 92 19:21:16 EST Subject: Nominations Sought for IEEE NNC Awards Message-ID: <9203160021.AA26259@ivy.Princeton.EDU> Nominations Sought for IEEE Neural Networks Council Awards The IEEE Neural Networks Council is soliciting nominations for its two awards. The awards will be presented at the June 1992 International Joint Conference on Neural Networks. Nominations for these awards should be submitted in writing according to the instructions given below. ------------------------------------------------------------------ IEEE Transactions on Neural Networks Outstanding Paper Award This is an award of $500 for the outstanding paper published in the IEEE Transactions on Neural Networks in the previous two-year period. For 1992, all papers published in 1990 (Volume 1) and in 1991 (Volume 2) in the IEEE Transactions on Neural Networks are eligible. For a paper with multiple authors, the award will be shared by the coauthors. Nominations must include a written statement describing the outstanding characteristics of the paper. The deadline for receipt of nominations is April 20, 1992. Nominations should be sent to Prof. Bradley W. Dickinson, NNC Awards Chair, Dept. of Electrical Engineering, Princeton University, Princeton, NJ 08544-5263. ------------------------------------------------------------------------ IEEE Neural Networks Council Pioneer Award This award has been established to recognize and honor the vision of those people whose efforts resulted in significant contributions to the early concepts and developments in the neural networks field. Up to three awards may be presented annually to outstanding individuals whose main contribution has been made at least fifteen years earlier. The recognition is engraved on the Neural Networks Pioneer Medal specially struck for the Council. Selection of Pioneer Medalists will be based on nomination letters received by the Pioneer Awards Committee. All who meet the contribution requirements are eligible, and anyone can nominate. The award is not approved posthumously. Written nomination letters must include a detailed description of the nominee's contributions and must be accompanied by full supporting documentation. For the 1992 Pioneer Award, nominations must be received by April 20, 1992. Nominations should be sent to Prof. Bradley W. Dickinson, NNC Pioneer Award Chair, Department of Electrical Engineering, Princeton University, Princeton, NJ 08544-5263. ----------------------------------------------------------------------------:x Questions and preliminary inquiries about the above awards should be directed to Prof. Bradley W. 
Dickinson, NNC Awards Chair; telephone: (609)-258-2916, electronic mail: bradley at ivy.princeton.edu  From dhw at santafe.edu Fri Mar 13 19:18:48 1992 From: dhw at santafe.edu (David Wolpert) Date: Fri, 13 Mar 92 17:18:48 MST Subject: New paper Message-ID: <9203140018.AA25119@sfi.santafe.edu> ******* DO NOT FORWARD TO OTHER LISTS ***************** The following paper has been placed in neuroprose: A RIGOROUS INVESTIGATION OF "EVIDENCE" AND "OCCAM FACTORS" IN BAYESIAN REASONING by David H. Wolpert Abstract: This paper first reviews the reasoning behind the Bayesian "evidence" procedure for setting parameters in the probability distributions involved in inductive inference. This paper then proves that the evidence procedure is incorrect. More precisely, this paper proves that the assumptions going into the evidence procedure do not, as claimed, "let the data determine the distributions". Instead, those assumptions simply amount to an implicit replacement of the original distributions, containing free parameters, with new distributions, none of whose parameters are free. For example, as used by MacKay [1991] in the context of neural nets, the evidence procedure is a means for using the training set to determine the free parameter alpha in the the prior distribution P({wi}) proportional to exp(alpha x S), where the N wi are the N weights in the network, and S is the sum of the squares of those weights. As this paper proves, in actuality the assumptions going into MacKay's use of the evidence procedure do not result in a distribution P({wi}) proportional to exp(alpha x S), for some alpha, but rather result in a parameter-less distribution, P({wi}) proportional to (S) ** [-(N/2 + 1)]. This paper goes on to prove that if one makes the assumption of an "entropic prior" with unknown parameter value, in addition to the assumptions used in the evidence procedure, then the prior is completely fixed, but in a form which can not be entropic. (This calls into question the self-consistency of the numerous arguments purporting to derive an entropic prior "from first principles".) Finally, this paper goes on to investigate the Bayesian first-principles "proof" of Occam's razor involving Occam factors. This paper proves that that "proof" is flawed. To retrieve this file, do the following: unix> ftp archive.cis.ohio-state.edu Name (archive.cis.ohio-state.edu:dhw): anonymous Password: neuron ftp> cd pub/neuroprose ftp> binary ftp> get wolpert.evidence.ps.Z ftp> quit unix> uncompress wolpert.evidence.ps.Z unix> lpr wolpert.evidence.ps  From marchman at merlin.psych.wisc.edu Mon Mar 16 15:03:10 1992 From: marchman at merlin.psych.wisc.edu (Virginia Marchman) Date: Mon, 16 Mar 92 14:03:10 -0600 Subject: TR AVAILABLE Message-ID: <9203162003.AA11810@merlin.psych.wisc.edu> ********************************************************************* CENTER FOR RESEARCH IN LANGUAGE UNIVERSITY OF CALIFORNIA, SAN DIEGO Technical Report #9201 ********************************************************************* LANGUAGE LEARNING IN CHILDREN AND NEURAL NETWORKS: PLASTICITY, CAPACITY AND THE CRITICAL PERIOD Virginia A. Marchman Department of Psychology, University of Wisconsin, Madison ABSTRACT This paper investigates constraints on dissociation and plasticity using connectionist models of the acquisition of an artificial language analogous to the English past tense. Several networks were "lesioned" in varying amounts both prior to and after the onset of training. 
In Study I, the network was trained on mappings similar to English regular verbs (e.g., walk ==> walked). Long term effects of injury were not observed in this simple homogeneous task, yet trajectories of development were dampened in relation to degree of damage prior to training, and post-natal lesions resulted in substantive short term performance deficits. In Study II, the vocabulary was comprised of regular, as well as irregular verbs (e.g., go ==> went). In intact nets, the acquisition of the regulars was considerably slowed, and performance was increasingly susceptible to injury, both acutely and in terms of eventual recovery, as a function of size and time of lesion. In contrast, irregulars were learned quickly and were relatively impervious to the effects of injury. Generalization to novel forms indicates that these behavioral dissociations result from the competition between the two classes of forms within a single mechanism system, rather than a selective disruption of the mechanism guiding the learning of regular forms. Two general implications for research on language development and breakdown are discussed: (1) critical period effects may derive from prior learning history in interaction with the language to be learned ("entrenchment"), rather than endogenously determined maturational change, and (2) selective dissociations in behavior CAN result from general damage in systems that are *not* modularized in terms of rule based vs. associative mechanisms (cf. Pinker, 1991). ********************************************************************* Hard copies of this report are available upon request from John at staight at crl.ucsd.edu. Please ask for CRL TR #9201, and provide your surface mailing address. In addition, this TR can be retrieved via anonymous ftp from the pub/neuralnets directory at crl.ucsd.edu. The entire report consists of 8 postscript files (1 text file, 7 files of figures). In order to ease retrieval, we have compiled these into a single tar file that must be extracted before printing. (Report is 20 pages total). Instructions: unix> ftp crl.ucsd.edu Connected to crl.ucsd.edu. 220 crl local FTP server (Version 5.85cub) ready. Name: anonymous 331 Guest login ok, send email address as password. Password: my-email-address ftp> cd pub/neuralnets 250 CWD command successful. ftp> binary 200 Type set to I. ftp> get tr9201.tar.Z ftp> bye unix> uncompress tr9201.tar.Z [MUST EXTRACT TAR FILE BEFORE PRINTING] unix> tar -xf tr9201.tar [RESULT IS 8 POSTSCRIPT FILES] unix> lpr tr9201.*.ps [or however you send your files to your postscript printer]  From thildebr at athos.csee.lehigh.edu Mon Mar 16 14:24:09 1992 From: thildebr at athos.csee.lehigh.edu (Thomas H. Hildebrandt ) Date: Mon, 16 Mar 92 14:24:09 -0500 Subject: Paper in neuroprose Message-ID: <9203161924.AA02714@athos.csee.lehigh.edu> The following paper, which has been submitted to IEEE Transactions on Neural Networks, is now available in PostScript format through the neuroprose archive: "Why Batch Learning is Slower Than Per-Sample Learning" Thomas H. Hildebrandt ABSTRACT We compare the convergence properties of the batch and per-sample versions of the standard backpropagation algorithm. The comparison is made on the basis of ideal step sizes computed for the two algorithms with respect to a simplified, linear problem. For either algorithm, convergence is guaranteed as long as no step exceeds the minimum ideal step size by more than a factor of 2. 
By limiting the discussion to a fixed, safe step size, we can compare the maximum step that can be taken by each algorithm in the worst case. It is found that the maximum fixed safe step size is $P$ times smaller for the batch version than for the per-sample version, where $P$ is the number of training examples. This fact is balanced somewhat by the fact that batch algorithm sums $P$ substeps in order to compute its step, meaning that the steps taken by the two algorithms are comparable in size. However, the batch algorithm takes only one step per epoch while the per-sample algorithm takes $P$. Thus, the conclusion is that the batch algorithm is $P$ times slower in a serial implementation. In response to last Fall's discussion involving Yann LeCun, Kamil Grajski and others regarding the unexpectedly poor performance of parallel implementations of the batch backpropagation algorithm, I performed an analysis of the convergence speed of batch and per-sample versions of the backpropagation algorithm based on calculation of the ideal step size. The conclusion is that, even if there are as many processors as training samples, the parallel implementation of a batch algorithm which does not alter its step size adaptively during an epoch can never be faster than the serial implementation of the per-sample algorithm. Due to the manner in which the problem is approached, it does not exactly go beyond the "multiple-copies" argument, as desired by LeCun. However, it does succeed in formalizing that argument. In the process, it also defines a relative measure of "redundancy" in the training set as correlation (the degree of collinearity) between the training vectors in the input space. Such a measure can be computed directly before training is begun. To obtain copies of this article: unix> ftp archive.cis.ohio-state.edu (or 128.146.8.52) Name : anonymous Password: ftp> cd pub/neuroprose ftp> binary ftp> get hildebrandt.batch.ps.Z ftp> quit unix> uncompress hildebrandt.batch.ps.Z unix> lpr -Pps hildebrandt.batch.ps (or however you print PostScript) (Thanks to Jordan Pollack for providing this valuable service to the NN Research community.) Thomas H. Hildebrandt Visiting Research Scientist CSEE Department Lehigh University  From xiru at Think.COM Wed Mar 18 10:33:51 1992 From: xiru at Think.COM (xiru Zhang) Date: Wed, 18 Mar 92 10:33:51 EST Subject: Paper in neuroprose In-Reply-To: "Thomas H. Hildebrandt "'s message of Mon, 16 Mar 92 14:24:09 -0500 <9203161924.AA02714@athos.csee.lehigh.edu> Message-ID: <9203181533.AA01429@yangtze.think.com> Date: Mon, 16 Mar 92 14:24:09 -0500 From: "Thomas H. Hildebrandt " The following paper, which has been submitted to IEEE Transactions on Neural Networks, is now available in PostScript format through the neuroprose archive: "Why Batch Learning is Slower Than Per-Sample Learning" Thomas H. Hildebrandt ABSTRACT We compare the convergence properties of the batch and per-sample versions of the standard backpropagation algorithm. The comparison is made on the basis of ideal step sizes computed for the two algorithms with respect to a simplified, linear problem. For either algorithm, convergence is guaranteed as long as no step exceeds the minimum ideal step size by more than a factor of 2. By limiting the discussion to a fixed, safe step size, we can compare the maximum step that can be taken by each algorithm in the worst case. 
It is found that the maximum fixed safe step size is $P$ times smaller for the batch version than for the per-sample version, where $P$ is the number of training examples. This fact is balanced somewhat by the fact that batch algorithm sums $P$ substeps in order to compute its step, meaning that the steps taken by the two algorithms are comparable in size. However, the batch algorithm takes ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ only one step per epoch while the per-sample algorithm takes $P$. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Thus, the conclusion is that the batch algorithm is $P$ times slower ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ in a serial implementation. ^^^^^^^^^^^^^^^^^^^^^^^^^^^ The last argument is not sound: the directions computed based on one training example and that based on all training examples can be very different, thus even if the step size is the same, the convergence rate can be different. This may not be a serious problem for the example in your paper, where the network has linear (identity) activation function and no hidden units, but in a multiple-layer network with non-linear units, not only the step size is important, the direction of the step is at least equally important. - Xiru Zhang  From thildebr at aragorn.csee.lehigh.edu Wed Mar 18 14:37:17 1992 From: thildebr at aragorn.csee.lehigh.edu (Thomas H. Hildebrandt ) Date: Wed, 18 Mar 92 14:37:17 -0500 Subject: Paper in neuroprose In-Reply-To: xiru Zhang's message of Wed, 18 Mar 92 10:33:51 EST <9203181533.AA01429@yangtze.think.com> Message-ID: <9203181937.AA18295@aragorn.csee.lehigh.edu> From: xiru Zhang Date: Wed, 18 Mar 92 10:33:51 EST Excerpt from my abstract: However, the batch algorithm takes ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ only one step per epoch while the per-sample algorithm takes $P$. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Thus, the conclusion is that the batch algorithm is $P$ times slower ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ in a serial implementation. ^^^^^^^^^^^^^^^^^^^^^^^^^^^ Comment by Zhang: The last argument is not sound: the directions computed based on one training example and that based on all training examples can be very different, thus even if the step size is the same, the convergence rate can be different. This may not be a serious problem for the example in your paper, where the network has linear (identity) activation function and no hidden units, but in a multiple-layer network with non-linear units, not only the step size is important, the direction of the step is at least equally important. - Xiru Zhang My rebuttal: The direction of the step taken by batch BP and the direction taken by per-sample BP are BOTH BAD, in the sense that they do not point directly at the minimum, but toward the bottom of the valley in the direction of steepest descent -- a different thing entirely. In the simplified case I examine, the batch algorithm steps daintily to the bottom of the valley after one epoch. In the mean time, the P-S algorithm is responding in turn to the individual components of the error function. In doing so, it often crosses the center of the valley in the total (batch) error surface. As a result, P-S takes larger steps on the average, and ends up closer to the minimum than does batch. In addition, this overstepping of the minimum makes the P-S version of the algorithm more robust toward local minima. This subject is dealt with more fully in the text of the paper. 
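A minimal numerical sketch of the batch versus per-sample comparison under discussion (not taken from the paper; the data, step sizes, and stopping tolerance below are arbitrary illustrative choices):

    import numpy as np

    # Illustrative linear least-squares problem: find w minimizing
    #   E(w) = (1/2) * sum_p (x_p . w - y_p)^2   over P training examples.
    rng = np.random.default_rng(0)
    P, n = 20, 5
    X = rng.normal(size=(P, n))
    w_true = rng.normal(size=n)
    y = X @ w_true                       # noise-free targets

    def batch_epochs(eta, tol=1e-6, max_epochs=100000):
        # One step per epoch, using the summed (batch) gradient.
        w = np.zeros(n)
        for epoch in range(1, max_epochs + 1):
            grad = X.T @ (X @ w - y)     # sum of the P per-sample gradients
            w -= eta * grad
            if np.mean((X @ w - y) ** 2) < tol:
                return epoch
        return None

    def per_sample_epochs(eta, tol=1e-6, max_epochs=100000):
        # P steps per epoch, one per training example, in fixed cyclic order.
        w = np.zeros(n)
        for epoch in range(1, max_epochs + 1):
            for p in range(P):
                w -= eta * X[p] * (X[p] @ w - y[p])
            if np.mean((X @ w - y) ** 2) < tol:
                return epoch
        return None

    # Fixed "safe" step sizes, roughly 1/lambda_max of the relevant Hessians.
    eta_batch = 1.0 / np.linalg.eigvalsh(X.T @ X).max()
    eta_ps = 1.0 / max(float(np.dot(x, x)) for x in X)

    print("batch epochs     :", batch_epochs(eta_batch))
    print("per-sample epochs:", per_sample_epochs(eta_ps))

Because the targets here are noise-free, both versions can converge exactly, and the epoch counts give only a rough feel for the serial-time comparison made in the abstract; nothing in this sketch reproduces the paper's actual analysis.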
One thing I did not note in my paper is that, in the linear case, a batch step will take you to the bottom of the valley by the shortest route. This reduces the dimensionality of the search by 1, so after a number of epochs equal to the number of dimensions in your search space, you know that you are at the minimum. However, pleasant this mathematical fiction is, when you add nonlinearity to the picture, all bets are off. It MAY BE that the next step taken by the batch algorithm will place it squarely at the bottom of the valley, where it will remain. The same can be true for the P-S algorithm. However possible this may be, to me it does not appear probable. The paper I have posted begs to be followed by one which analyzes the likelihood of such a lucky step occurring, or at least gives a stochastic view of the paths which the two algorithms follow in approaching the minimum. I will look into it, time permitting. Viewed intuitively: In any guessing game, you are likely to do better if the number of queries you are allowed to make increases. The batch algorithm allows itself one query (as to the local slope of the error surface) per epoch, while the per-sample algorithm gets $P$ times as many. Thomas H. Hildebrandt CSEE Department Lehigh University  From TEPPER at CVAX.IPFW.INDIANA.EDU Wed Mar 18 15:42:20 1992 From: TEPPER at CVAX.IPFW.INDIANA.EDU (TEPPER@CVAX.IPFW.INDIANA.EDU) Date: Wed, 18 Mar 1992 15:42:20 -0500 (EST) Subject: PDP & NN5 Message-ID: <920318154220.20c023e3@CVAX.IPFW.INDIANA.EDU> Fifth NN & PDP CONFERENCE PROGRAM - April 9, 10 and 11,1992 ----------------------------------------------------------- The Fifth Conference on Neural Networks and Parallel Distributed Processing at Indiana University-Purdue University at Fort Wayne will be held April 9, 10, and 11, 1992. Conference registration is $20 (on site). Students and members or employees of supporting organizations attend free. Some limited financial support might also be available to allow students to attend. Inquiries should be addressed to: US mail: ------- Pr. Samir Sayegh Physics Department Indiana University-Purdue University Fort Wayne, IN 46805-1499 email: sayegh at ipfwcvax.bitnet ----- FAX: (219)481-6880 --- Voice: (219) 481-6306 OR 481-6157 ----- All talks will be held in Kettler Hall, Room G46: Thursday, April 9, 6pm-9pm; Friday Morning & Afternoon (Tutorial Sessions), 8:30am-12pm & 1pm-4:30pm and Friday Evening 6pm-9pm; Saturday, 9am-12noon. Parking will be available near the Athletic Building or at any Blue A-B parking lots. Do not park in an Orange A lot or you may get a parking violation ticket. Special hotel rates (IPFW corporate rates) are available at Canterbury Green, which is a 5 minute drive from the campus. The number is (219) 485-9619. The Marriott Hotel also has corporate rates for IPFW and is about a 10 minute drive. Their number is (219) 484-0411. Another hotel with corporate rates for IPFW is Don Hall's Guesthouse (about 10 minutes away). Their number is (219) 489-2524. The following talks will be presented: Applications I - Thursday 6pm-7:30pm -------------------------------------- Nasser Ansari & Janusz A. Starzyk, Ohio University. DISTANCE FIELD APPROACH TO HANDWRITTEN CHARACTER RECOGNITION Thomas L. Hemminger & Yoh-Han Pao, Case Western Reserve University. A REAL- TIME NEURAL-NET COMPUTING APPROACH TO THE DETECTION AND CLASSIFICATION OF UNDERWATER ACOUSTIC TRANSIENTS Seibert L. Murphy & Samir I. Sayegh, Indiana-Purdue University. 
ANALYSIS OF THE CLASSIFICATION PERFORMANCE OF A BACK PROPAGATION NEURAL NETWORK DESIGNED FOR ACOUSTIC SCREENING S. Keyvan, L. C. Rabelo, & A. Malkani, Ohio University. NUCLEAR DIAGNOSTIC MONITORING SYSTEM USING ADAPTIVE RESONANCE THEORY J.L. Fleming & D.G. Hill, Armstrong Lab, Brooks AFB. STUDENT MODELING USING ARTIFICIAL NEURAL NETWORKS Biological and Cooperative Phenomena Optimization I - Thursday 7:50pm-9pm --------------------------------------------------------------------------- Ljubomir T. Citkusev & Ljubomir J., Buturovic, Boston University. NON- DERIVATIVE NETWORK FOR EARLY VISION Yalin Hu & Robert J. Jannarone, University of South Carolina. A NEUROCOMPUTING KERNEL ALGORITHM FOR REAL-TIME, CONTINUOUS COGNITIVE PROCESSING M.B. Khatri & P.G. Madhavan, Indiana-Purdue University, Indianapolis. ANN SIMULATION OF THE PLACE CELL PHENOMENON USING CUE SIZE RATIO Mark M. Millonas, University of Texas at Austin. CONNECTIONISM AND SWARM INTELLIGENCE --------------------------------------------------------------------------- --------------------------------------------------------------------------- Tutorials I - Friday 8:30am-11:45am ------------------------------------- Bill Frederick, Indiana-Purdue University. INTRODUCTION TO FUZZY LOGIC Helmut Heller, University of Illinois. INTRODUCTION TO TRANSPUTER SYSTEMS Arun Jagota, SUNY-Buffalo. THE HOPFIELD NETWORK, ASSOCIATIVE MEMORIES, AND OPTIMIZATION Tutorials II - Friday 1:15pm-4:30pm ------------------------------------- Krzysztof J. Cios, University Of Toledo. SELF-GENERATING NEURAL NETWORK ALGORITHM : CID3 APPLICATION TO CARDIOLOGY Robert J. Jannarone, University of South Carolina. REAL-TIME NEUROCOMPUTING, AN INTRODUCTION Network Analysis I - Friday 6pm-7:30pm ---------------------------------------- M.R. Banan & K.D. Hjelmstad, University of Illinois at Urbana-Champaign. A SUPERVISED TRAINING ENVIRONMENT BASED ON LOCAL ADAPTATION, FUZZINESS, AND SIMULATION Pranab K. Das II, University of Texas at Austin. CHAOS IN A SYSTEM OF FEW NEURONS Arun Maskara & Andrew Noetzel, University Heights. FORCED LEARNING IN SIMPLE RECURRENT NEURAL NETWORKS Samir I. Sayegh, Indiana-Purdue University. SEQUENTIAL VS CUMULATIVE UPDATE: AN EXPANSION D.A. Brown, P.L.N. Murthy, & L. Berke, The College of Wooster. SELF- ADAPTATION IN BACKPROPAGATION NETWORKS THROUGH VARIABLE DECOMPOSITION AND OUTPUT SET DECOMPOSITION Applications II - Friday 7:50pm-9pm ------------------------------------- Susith Fernando & Karan Watson, Texas A & M University. ANNs TO INCORPORATE ENVIRONMENTAL FACTORS IN HI FAULTS DETECTION D.K. Singh, G.V. Kudav, & T.T. Maxwell, Youngstown State University. FUNCTIONAL MAPPING OF SURFACE PRESSURES ON 2-D AUTOMOTIVE SHAPES BY NEURAL NETWORKS K. Hooks, A. Malkani, & L. C. Rabelo, Ohio University. APPLICATION OF ARTIFICIAL NEURAL NETWORKS IN QUALITY CONTROL CHARTS B.E. Stephens & P.G. Madhavan, Purdue University at Indianapolis. SIMPLE NONLINEAR CURVE FITTING USING THE ARTIFICIAL NEURAL NETWOR ------------------------------------------------------------------------------- ------------------------------------------------------------------------------- Network Analysis II - Saturday 9am-10:30am ------------------------------------------- Sandip Sen, University of Michigan. NOISE SENSITIVITY IN A SIMPLE CLASSIFIER SYSTEM Xin Wang, University of Southern California. DYNAMICS OF DISCRETE-TIME RECURRENT NEURAL NETWORKS: PATTERN FORMATION AND EVOLUTION Zhenni Wang and Christine Di Massimo, University of Newcastle. 
A PROCEDURE FOR DETERMINING THE CANONICAL STRUCTURE OF MULTILAYER NEURAL NETWORKS Srikanth Radhakrishnan, Tulane University. PATTERN CLASSIFICATION USING THE HYBRID COULOMB ENERGY NETWORK Biological and Cooperative Phenomena Optimization II - Saturday 10:50am-12noon ------------------------------------------------------------------------------- J. Wu, M. Penna, P.G. Madhavan, & L. Zheng, Purdue University at Indianapolis. COGNITIVE MAP BUILDING AND NAVIGATION C. Zhu, J. Wu, & Michael A. Penna, Purdue University at Indianapolis. USING THE NADEL TO SOLVE THE CORRESPONDENCE PROBLEM Arun Jagota, SUNY-Buffalo. COMPUTATIONAL COMPLEXITY OF ANALYZING A HOPFIELD-CLIQUE NETWORK Assaad Makki, & Pepe Siy, Wayne State University. OPTIMAL SOLUTIONS BY MODIFIED HOPFIELD NEURAL NETWORKS  From thildebr at aragorn.csee.lehigh.edu Wed Mar 18 15:53:30 1992 From: thildebr at aragorn.csee.lehigh.edu (Thomas H. Hildebrandt ) Date: Wed, 18 Mar 92 15:53:30 -0500 Subject: Paper in neuroprose In-Reply-To: garyc@cs.uoregon.edu's message of Wed, 18 Mar 92 11:35:13 -0800 <9203181935.AA29887@sisters.cs.uoregon.edu> Message-ID: <9203182053.AA18310@aragorn.csee.lehigh.edu> Date: Wed, 18 Mar 92 11:35:13 -0800 From: garyc at cs.uoregon.edu But your measure of redundancy - collinearity - seems appropriate for your linear domain; what about redundancy for a nonlinear map? gary cottrell I think that the appropriate measure is the degree of collinearity of the training vectors in class space, i.e. after the nonlinear mapping has been performed. Obviously, this requires you to know the answer (i.e. have in hand the completely trained network) before you can measure redundancy, so the measure is not very useful. However, if you accept it as the correct definition of redundancy, then you can apply certain assumptions (e.g. local linearity of the input space, linearity in certain subspaces, etc.) which will allow you to estimate the measure a priori with varying degrees of accuracy. Thomas H. Hildebrandt CSEE Department Lehigh University  From garyc at cs.uoregon.edu Wed Mar 18 14:35:13 1992 From: garyc at cs.uoregon.edu (garyc@cs.uoregon.edu) Date: Wed, 18 Mar 92 11:35:13 -0800 Subject: Paper in neuroprose Message-ID: <9203181935.AA29887@sisters.cs.uoregon.edu> But your measure of redundancy - collinearity - seems appropriate for your linear domain; what about redundancy for a nonlinear map? gary cottrell  From berenji at ptolemy.arc.nasa.gov Thu Mar 19 20:06:11 1992 From: berenji at ptolemy.arc.nasa.gov (Hamid Berenji) Date: Thu, 19 Mar 92 17:06:11 PST Subject: FUZZ-IEEE'93 call for papers (revised) Message-ID: Please place the following call for papers on your mailing list. Thank you, Hamid R. Berenji Senior Research Scientist NASA Ames Research Center *************************** CALL FOR PAPERS SECOND IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS FUZZ-IEEE'93 San Francisco, California March 28 - April 1, 1993 In recent years, increasing attention has been devoted to fuzzy-logic approaches and to their application to the solution of real-world problems. 
The Second IEEE International Conference on Fuzzy Systems (FUZZ-IEEE '93) will be dedicated to the discussion of advances in: * Basic Principles and Foundations of Fuzzy Logic * Relations between Fuzzy Logic and other Approximate Reasoning Methods * Qualitative and Approximate-Reasoning Modeling * Hardware Implementations of Fuzzy-Logic Algorithms * Learning and Acquisition of Approximate Models * Relations between Fuzzy Logic and Neural Networks * Applications to * System Control * Intelligent Information Systems * Case-Based Reasoning * Decision Analysis * Signal Processing * Image Understanding * Pattern Recognition * Robotics and Automation * Intelligent Vehicle and Highway Systems This conference will be held concurrently with the 1993 IEEE International Conference on Neural Networks. Participants will be able to attend the technical events of both meetings. CONFERENCE ORGANIZATION This conference is sponsored by the IEEE Neural Networks Council, in cooperation with: International Fuzzy Systems Association North American Fuzzy Information Processing Society Japan Society for Fuzzy Theory and Systems. IEEE Systems, Man, and Cybernetics Society ELITE - European Laboratory for Intelligent Techniques Engineering The conference includes tutorials, exhibits, plenary sessions, and social events. ORGANIZING COMMITTEE GENERAL CHAIR: Enrique H.Ruspini Artificial Intelligence Center SRI International CHAIR: Piero P. Bonissone General Electric CR&D PROGRAM ADVISORY BOARD: J. Bezdek E. Sanchez E. Trillas D. Dubois Ph. Smets T. Yamakawa G. Klir M. Sugeno L.A. Zadeh H. Prade T. Terano H.J. Zimmerman FINANCE: R. Tong (Chair) R. Nutter PUBLICITY: H. Berenji (Chair) B. D'Ambrosio R. Lopez de Mantaras T. Takagi LOCAL ARRANGEMENTS: S. Ovchinnikov TUTORIALS: J. Bezdek (Chair) H. Berenji H. Watanabe EXHIBITS: A. Ralescu M. Togai L. Valverde W. Xu T. Yamakawa H.J. Zimmerman TUTORIAL INFORMATION The following tutorials have been scheduled: Introduction to Fuzzy-Set Theory, Uncertainty, and Fuzzy Logic Prof. George J. Klir, SUNY Fuzzy Logic in Databases and Information Retrieval Prof. Maria Zemankova, NSF Fuzzy Logic and Neural Networks for Pattern Recognition Prof. James C. Bezdek, Univ. of West Florida Hardware Approaches to Fuzzy-Logic Applications Prof. Hiroyuki Watanabe, Univ. North Carolina Fuzzy Logic and Neural Networks for Control Systems Dr. Hamid R. Berenji, NASA Ames Research Center Fuzzy Logic and Neural Networks for Computer Vision Prof. James Keller, Univ. of Missouri EXHIBIT INFORMATION Exhibitors are encouraged to present the latest innovations in fuzzy hardware, software, and systems based on applications of fuzzy logic. For additional information, please contact Meeting Management at Tel. (619) 453-6222, FAX (619) 535-3880. CALL FOR PAPERS In addition to the papers related to any of the above areas, the program committee cordially invites interested authors to submit papers dealing with any aspects of research and applications related to the use of fuzzy models. Papers will be carefully reviewed and only accepted papers will appear in the FUZZ-IEEE '93 Proceedings. DEADLINE FOR PAPERS: September 21, 1992 Papers must be received by September 21, 1992. Six copies of the paper must be submitted. The paper must be written in English and its length should not exceed 8 pages including figures, tables, and references. Papers must be submitted on 8-1/2" x 11" white paper with 1" margins on all four sides. 
They should be prepared by typewriter or letter-quality printer in one column format, single-spaced, in Times or similar type style, 10 points or larger, and printed on one side of the paper only. Please include title, author(s) name(s) and affiliation(s) on top of first page followed by an abstract. FAX submissions are not acceptable. Please send submissions prior to the deadline to: Dr. Piero P. Bonissone General Electric Corporate Research and Development Building K-1, Room 5C32A 1 River Road Schenectady, New York 12301 FOR ADDITIONAL INFORMATION REGARDING FUZZ-IEEE'93 PLEASE CONTACT: Meeting Management 5665 Oberlin Drive Suite 110 San Diego CA 92121 Tel. (619) 453-6222 FAX (619) 535-3880 -------  From guy at minster.york.ac.uk Thu Mar 19 08:13:10 1992 From: guy at minster.york.ac.uk (guy@minster.york.ac.uk) Date: 19 Mar 1992 13:13:10 GMT Subject: paper at neuroprose: Neural Networks as Components Message-ID: A paper "Neural Networks as Components", which is to be presented at the SPIE Conference on the Science of Artificial Neural Networks in Orlando, Florida in April, is available in the neuroprose archive as "smith.components.ps.Z". The paper discusses the use of neural networks as system components, and suggests research has concentrated too much on algorithms in isolation. The desirable properties of good components are listed: uniform interface, wide range of functionality and performance, and robustness. The benefits of viewing networks as plug compatible components are shown in a design example. Directions for research are suggested. The neuroprose archive is at internet address "128.146.8.52" in directory "pub/neuroprose". Limited numbers of hard copies may be available. The paper will be in conference proceedings. Happy reading, Guy Smith.  From nin at cns.brown.edu Thu Mar 19 10:59:36 1992 From: nin at cns.brown.edu (Nathan Intrator) Date: Thu, 19 Mar 92 10:59:36 EST Subject: Why batch learning is slower Message-ID: <9203191559.AA09604@cns.brown.edu> "Why Batch Learning is Slower Than Per-Sample Learning" Thomas H. Hildebrandt From the abstract: "...For either algorithm, convergence is guaranteed as long as no step exceeds the minimum ideal step size by more than a factor of 2. By limiting the discussion to a fixed, safe step size, we can compare the maximum step that can be taken by each algorithm in the worst case." ------- There is no "FIXED safe step size" for the stochastic version, namely there is no convergence proof for a fixed learning rate of the stochastic version. The paper cited by Chung-Ming Kuan and Kurt Hornik does not imply that either. It is therefore difficult to draw conclusions from this paper. - Nathan  From tenorio at ecn.purdue.edu Fri Mar 20 11:55:35 1992 From: tenorio at ecn.purdue.edu (tenorio@ecn.purdue.edu) Date: Fri, 20 Mar 1992 10:55:35 -0600 Subject: workshop invitations for PEEII Message-ID: <9203201545.AA18429@dynamo.ecn.purdue.edu> > >"NETS WORK" WORKSHOP > >Neural network research will be the theme of the Spring 1992 >Purdue Electrical Engineering Industrial Institute (PEEII) workshop >on Monday and Tuesday, April 6 and 7. > >The workshop is a regular feature of our industrial affiliates program, >and we would like representatives of your organization to be our guests >at this meeting. 
> >Presentations will include: > > The Parallel Distributed Processing Lab and Self-Organizing Structures > Similarity-Based Algorithms for Prediction and Control > Fusing Algorithm Responses: Multiple Networks Cooperating > on a Single Task > Applications of Feedforward Neural Networks to the Control of Dynamical > Systems > Fuzzy Neural Networks > A Neural-Network-Based Fuzzy Logic Control and Decision System > Novel Neural Network Architectures and Their Applications > Parallel, Self-Organizing, Hierachical Neural Networks with Continuous > Inputs and Outputs > Solving Constrained Optimization Problems with Artificial Neural > Networks > Learning Algorithms in Associative Memories > Neural Computing With Linear Threshold Elements > Implementation of Neural Networks on Highly-Parallel Computers >Keynote Address: >Applying Neural Networks for Process Understanding and Process Control > > >Presentation schedules and workshop registration forms are available from: > >Mary Moyars-Johnson >Manager,Industrial Relations >phone: (317) 494-3441 >e-mail: moyars at ecn.purdue.edu > >Workshop reservations must be made by March 26. > >Room reservations may be made at the Union Club (317) 494-8913 >or at local hotels/motels. > > > < Manoel Fernando Tenorio > < (tenorio at ecn.purdue.edu) or (..!pur-ee!tenorio) > < MSEE233D > < Parallel Distributed Structures Laboratory > < School of Electrical Engineering > < Purdue University > < W. Lafayette, IN, 47907 > < Phone: 317-494-3482 Fax: 317-494-6440 >  From thildebr at athos.csee.lehigh.edu Fri Mar 20 10:37:57 1992 From: thildebr at athos.csee.lehigh.edu (Thomas H. Hildebrandt ) Date: Fri, 20 Mar 92 10:37:57 -0500 Subject: Why batch learning is slower In-Reply-To: Nathan Intrator's message of Thu, 19 Mar 92 10:59:36 EST <9203191559.AA09604@cns.brown.edu> Message-ID: <9203201537.AA04951@athos.csee.lehigh.edu> Date: Thu, 19 Mar 92 10:59:36 EST From: nin at cns.brown.edu (Nathan Intrator) "Why Batch Learning is Slower Than Per-Sample Learning" Thomas H. Hildebrandt From the abstract: "...For either algorithm, convergence is guaranteed as long as no step exceeds the minimum ideal step size by more than a factor of 2. By limiting the discussion to a fixed, safe step size, we can compare the maximum step that can be taken by each algorithm in the worst case." ------- There is no "FIXED safe step size" for the stochastic version, namely there is no convergence proof for a fixed learning rate of the stochastic version. The paper cited by Chung-Ming Kuan and Kurt Hornik does not imply that either. It is therefore difficult to draw conclusions from this paper. - Nathan I have not done it, but it appears straightforward to show convergence for the linear network model with a fixed step size. The actual step taken is the product of the step size with the derivative of the error. If each step taken reduces the error in an unbiased way, then the process will converge. In this, I am not really treating a stochastic version, since in the true sense, this would make the training set an infinite sequence of random vectors. For both algorithms I assumed that there is a finite set of training vectors which can be examined repeatedly. I think this is a fairly standard assumption. It IS difficult to draw firm conclusions from this paper regarding the behavior of the two versions of BP on multilayer nonlinear networks, since the analysis is restricted to a single-layer linear network. 
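As a point of reference for the fixed-step-size discussion above, the linear (quadratic-error) case can be written out explicitly; this is standard material, and the notation below is generic rather than taken from either posting. For a single-layer linear network the batch error is quadratic,
$$ E(w) = \tfrac{1}{2}\,(w - w^{*})^{\top} H \,(w - w^{*}), \qquad H = \sum_{p=1}^{P} x_p x_p^{\top} \ \text{for least squares}, $$
and a fixed-step gradient update gives
$$ w_{t+1} - w^{*} = (I - \eta H)\,(w_t - w^{*}), $$
which converges from every starting point exactly when $|1 - \eta\lambda| < 1$ for all eigenvalues $\lambda$ of $H$, i.e. when $0 < \eta < 2/\lambda_{\max}(H)$. Since the "ideal" step along an eigendirection with eigenvalue $\lambda$ is $1/\lambda$, the safe range is a factor of $2$ times the smallest ideal step, which is the condition quoted in the abstract; replacing $H$ by a single term $x_p x_p^{\top}$ gives the corresponding per-sample bound.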
It was intended to provide some intuition as to the unexpectedly poor performance of parallel implementations of batch BP, and to suggest an approach for the analysis of the multilayer nonlinear case. Thomas H. Hildebrandt Visiting Research Scientist CSEE Department Lehigh University  From UDAH256 at oak.cc.kcl.ac.uk Fri Mar 20 07:46:00 1992 From: UDAH256 at oak.cc.kcl.ac.uk (Mark Plumbley) Date: Fri, 20 Mar 92 12:46 GMT Subject: M.Sc. and Ph.D. Courses in NNs at King's College London Message-ID: Fellow Connectionists, Please post or forward this announcement about our M.Sc. and Ph.D. courses to anyone who might be interested. Please direct any enquiries about the courses to the postgraduate secretary (address at the end of the notice). Thanks, Mark. ------------------------------------------------------------------------- Dr. Mark D. Plumbley M.Plumbley at oak.cc.kcl.ac.uk Tel: +44 71 873 2241 Centre for Neural Networks Fax: +44 71 873 2017 Department of Mathematics/King's College London/Strand/London WC2R 2LS/UK ------------------------------------------------------------------------- CENTRE FOR NEURAL NETWORKS and DEPARTMENT OF MATHEMATICS King's College London Strand London WC2R 2LS, UK M.Sc. AND Ph.D. COURSES IN NEURAL NETWORKS --------------------------------------------------------------------- M.Sc. in INFORMATION PROCESSING and NEURAL NETWORKS --------------------------------------------------- A ONE YEAR COURSE CONTENTS Dynamical Systems Theory Fourier Analysis Biosystems Theory Advanced Neural Networks Control Theory Combinatorial Models of Computing Digital Learning Digital Signal Processing Theory of Information Processing Communications Neurobiology REQUIREMENTS First Degree in Physics, Mathematics, Computing or Engineering NOTE: For 1992/93 we have 3 SERC quota awards for this course, which must be allocated by 30th July 1992. --------------------------------------------------------------------- Ph.D. in NEURAL COMPUTING ------------------------- A 3-year Ph.D. programme in NEURAL COMPUTING is offered to applicants with a First degree in Mathematics, Computing, Physics or Engineering (others will also be considered). The first year consists of courses given under the M.Sc. in Information Processing and Neural Networks (see attached notice). Second and third year research will be supervised in one of the various programmes in the development and application of temporal, non-linear and stochastic features of neurons in visual, auditory and speech processing. There is also work in higher level category and concept formation and episodic memory storage. Analysis and simulation are used, both on PC's SUNs and main frame machines, and there is a programme on the development and use of adaptive hardware chips in VLSI for pattern and speed processing. This work is part of the activities of the Centre for Neural Networks in the School of Physical Sciences and Engineering, which has 47 researchers in Neural Networks. It is one of the main centres of the subject in the U.K. 
---------------------------------------------------------------------
For further information on either of these courses please contact: Postgraduate Secretary Department of Mathematics King's College London Strand London WC2R 2LS, UK

From arun at hertz.njit.edu Mon Mar 16 13:44:56 1992 From: arun at hertz.njit.edu (arun maskara spec lec cis) Date: Mon, 16 Mar 92 13:44:56 -0500 Subject: Paper available in Neuroprose Message-ID: <9203161844.AA23382@hertz.njit.edu>

The following paper is now available by ftp from the neuroprose archive:

Forcing Simple Recurrent Neural Networks to Encode Context

Arun Maskara, New Jersey Institute of Technology, Department of Computer and Information Sciences, University Heights, Newark, NJ 07102, arun at hertz.njit.edu
Andrew Noetzel, The William Paterson College, Department of Computer Science, Wayne, NJ 07470

Abstract

The Simple Recurrent Network (SRN) is a neural network model that has been designed for the recognition of symbol sequences. It is a back-propagation network with a single hidden layer of units. The symbols of a sequence are presented one at a time at the input layer, and the activation pattern of the hidden units during the previous input symbol is presented as an auxiliary input. Previous research has shown that the SRN can be trained to behave as a finite state automaton (FSA) which accepts the valid strings corresponding to a particular grammar and rejects the invalid strings. It does this by predicting each successive symbol in the input string. However, the SRN architecture sometimes fails to encode the context necessary to predict the next input symbol. This happens when two different states in the FSA generating the strings have the same output, and the SRN develops similar hidden-layer encodings for these states. The failure happens more often when the number of units in the hidden layer is limited. We have developed a new architecture, called the Forced Simple Recurrent Network (FSRN), that solves this problem. This architecture contains additional output units, which are trained to show the current input and the previous context. Simulation results show that for certain classes of FSA with $u$ states, the SRN with $\lceil \log_2 u \rceil$ units in the hidden layer fails, whereas the FSRN with the same number of hidden-layer units succeeds.

-------------------------------------------------------------------------------
A copy of the postscript file has been placed in the neuroprose archive. The file name is maskara.fsrn.ps.Z The usual instructions can be followed to obtain the file from the directory pub/neuroprose at the ftp site archive.cis.ohio-state.edu

Arun Maskara

From ecai92 at ai.univie.ac.at Mon Mar 23 06:23:10 1992 From: ecai92 at ai.univie.ac.at (ECAI92 Vienna Conference Service) Date: Mon, 23 Mar 1992 12:23:10 +0100 Subject: ECAI92 Advance Information Message-ID: <199203231123.AA11857@dublin.ai.univie.ac.at>

=======================================================================
Advance Information - ECAI92 - Advance Information - ECAI92 - VIENNA
=======================================================================

10th European Conference on Artificial Intelligence (ECAI 92) August 3-7, 1992, Vienna, Austria

Programme Chairperson Bernd Neumann, University of Hamburg, Germany
Local Arrangements Chairperson Werner Horn, Austrian Research Institute for AI, Vienna

The European Conference on Artificial Intelligence (ECAI) is the European forum for scientific exchange and presentation of AI research.
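Returning briefly to the SRN paper announced above, here is a minimal sketch of the simple recurrent (Elman-style) forward pass described in the abstract; the layer sizes, the tanh/softmax choices, and all names are illustrative assumptions, and the FSRN's additional output units are not shown.

    import numpy as np

    # Minimal simple-recurrent-network (SRN) prediction pass.  At each step
    # the previous hidden activation is fed back as an auxiliary "context"
    # input, and the output is a distribution over the next symbol.
    rng = np.random.default_rng(1)
    n_sym, n_hid = 5, 4                                   # alphabet size, hidden units

    W_in = rng.normal(scale=0.5, size=(n_hid, n_sym))     # input   -> hidden
    W_ctx = rng.normal(scale=0.5, size=(n_hid, n_hid))    # context -> hidden
    W_out = rng.normal(scale=0.5, size=(n_sym, n_hid))    # hidden  -> output

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def srn_predict(symbol_indices):
        context = np.zeros(n_hid)
        predictions = []
        for s in symbol_indices:
            x = np.zeros(n_sym)
            x[s] = 1.0                                    # one-hot current symbol
            hidden = np.tanh(W_in @ x + W_ctx @ context)  # combine input and context
            predictions.append(softmax(W_out @ hidden))   # predicted next symbol
            context = hidden                              # carried to the next step
        return predictions

    print(srn_predict([0, 2, 1, 3])[-1])                  # untrained: arbitrary output

In training, the prediction at each step would be compared against the actually observed next symbol and the error back-propagated, which is the setup the abstract describes.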
The aim of the conference is to cover all aspects of AI research and to bring together basic research and applied research. The Technical Programme will include paper presentations, invited talks, survey sessions, workshops, and tutorials. The conference is designed to cover all subfields of AI, including non-symbolic methods. ECAIs are held in alternate years and are organized by the European Coordinating Committee for Artificial Intelligence (ECCAI). The 10th ECAI in 1992 will be hosted by the Austrian Society for Artificial Intelligence (OGAI). The conference will take place at the Vienna University of Economics and Business Administration. PROGRAMME STRUCTURE Mon-Tue (Aug 3-4): Tutorials and Workshops Wed-Fri (Aug 5-7): Invited Talks, Paper Presentations, Survey Sessions Tue-Fri (Aug 4-7): Industrial Exhibition ======================== INVITED LECTURES ============================== Stanley J.Rosenschein (Teleos Research, Palo Alto, Calif., USA): Perception and Action in Autonomous Systems Oliviero Stock (IRST, Trento, Italy): A Third Modality of Natural Language? Promising Trends in Applied Natural Language Processing Peter Struss (Siemens AG, Muenchen, Germany): Knowledge-Based Diagnosis - An Important Challenge and Touchstone for AI =================== TECHNICAL PAPERS PROGRAMME ========================= This will consist of papers selected from the 680 that were submitted. These papers will be given in parallel sessions held from August 5 to 7, 1992. The topics of the papers include: - Automated Reasoning - Cognitive Modeling - Connectionist and PDP Models for AI - Distributed AI and Multiagent Systems - Enabling Technology and Systems - Integrated Systems - Knowledge Representation - Machine Learning - Natural Language - Philosophical Foundations - Planning, Scheduling, and Reasoning about Actions - Principles of AI Applications - Reasoning about Physical Systems - Robotics - Social, Economic, Legal, and Artistic Implications - User Interfaces - Verification, Validation & Test of Knowledge-Based Systems - Vision and Signal Understanding ============================ TUTORIALS ================================= --- Tutorials ----- Mon, August 3, 9:00-13:00 Applied Qualitative Reasoning Robert Milne, Intelligent Applications Ltd, Scotland, and Louise Trave-Massuyes, LAAS, Toulouse, France In Search of a New Planning Paradigm - Steps Beyond Classical Planning Joachim Hertzberg, GMD, Germany, and Sam Steel, Essex University, Cholchester, UK Machine Learning: Reality and Perspectives Lorenza Saitta, Universita di Torino, Italy --- Tutorials ----- Mon, August 3, 14:00-18:00 AI in Service and Support Anil Rewari, Digital Equipment Corp., Marlboro, Mass. Case-Based Reasoning Katia P. Sycara, Carnegie Mellon University, Pittsburgh, Penn. Computer Vision, Seeing Systems, and Their Applications Jan-Olof Eklundh, Royal Institute of Technology, Stockholm, Sweden Nonmonotonic Reasoning Gerhard Brewka, ICSI, Berkeley, Calif., and Kurt Konolige, SRI, Menlo Park, Calif. 
--- Tutorials ----- Tue, August 4, 9:00-13:00 Distributed AI Frank von Martial, Bonn, and Donald Steiner, Siemens AG, Germany Fuzzy Set-Based Methods for Inference and Control Henri Prade, IRIT, Universite Paul Sabatier, Toulouse, France Validation of Knowledge-Based Systems Jean-Pierre Laurent, Universite de Savoie, Chambery, France --- Tutorials ----- Tue, August 4, 14:00-18:00 Current Trends in Language Technology Harald Trost, Austrian Research Institute for AI and University of Vienna, Austria KADS: Practical, Structured KBS Development Robert Martil, Lloyd's Register of Shipping, Croydon, UK, and Bob Wielinga, University of Amsterdam, The Netherlands Neural Networks: From Theory to Applications Francoise Fogelman Soulie, Mimetics, France User Modeling and User-Adapted Interaction Sandra Carberry, University of Delaware, Newark, Delaware, and Alfred Kobsa, University of Konstanz, Germany ============================ WORKSHOPS ================================= Workshops are part of the ECAI92 scientific programme. They will give participants the opportunity to discuss specific technical topics in a small, informal environment, which encourages interaction and exchange of ideas. Persons interested in attending a workshop should contact the workshop organizer (addresses below), and the conference office (ADV) for ECAI92 registration. Note that all workshops require an early application for participation. A full description of all workshops can be obtained by sending an email to ecai92.ws at ai.univie.ac.at, which will automatically respond. --- Workshops ----- Mon, August 3 Art and AI: Art / ificial Intelligence Robert Trappl, Austrian Research Institute for Artificial Intelli- gence, Schottengasse 3, A-1010 Vienna, Austria; Fax: +43-1-630652, Email: robert at ai.univie.ac.at Coping with Linguistic Ambiguity in Typed Feature Formalisms Harald Trost, Austrian Research Institute for Artificial Intelli- gence, Schottengasse 3, A-1010 Vienna, Austria; Fax: +43-1-630652, Email: harald at ai.univie.ac.at Formal Specification Methods for Complex Reasoning Systems Jan Treur, AI Group, Dept.of Mathematics and Computer Science, Vrije Universiteit Amsterdam, De Boelelaan 108-1a, NL-1081 HV Amsterdam, The Netherlands; Fax: +31-29-6427705, Email: treur at cs.vu.nl Knowledge Sharing and Reuse: Ways and Means Nicolaas J.I. 
Mars, Dept.of Computer Science, University of Twente, PO Box 217, NL-7500 AE Enschede, The Netherlands; Fax: +31-53-339605, Email: mars at cs.utwente.nl Model-Based Reasoning Gerhard Friedrich, Franz Lackinger, Dept.Information Systems, CD-Lab for Expert Systems, Univ.of Technology, Paniglg.16, A-1040 Vienna; Fax: +43-1-5055304, Email: friedrich at vexpert.dbai.tuwien.ac.at Neural Networks and a New AI Georg Dorffner, Austrian Research Institute for Artificial Intelli- gence, Schottengasse 3, A-1010 Vienna, Austria; Fax: +43-1-630652, Email: georg at ai.univie.ac.at Scheduling of Production Processes Juergen Dorn, CD-Laboratory for Expert Systems, University of Technology, Paniglgasse 16, A-1040 Vienna, Austria; Fax: +43-1-5055304; Email: dorn at vexpert.dbai.tuwien.ac.at Validation, Verification and Test of KBS Marc Ayel, LIA, University of Savoie, BP.1104, F-73011 Chambery, France; Fax: +33-79-963475, Email: ayel at frgren81.bitnet --- Workshops ----- Tue, August 4 Advances in Real-Time Expert System Technologies Wolfgang Nejdl, Department for Information Systems, CD-Lab for Expert Systems, University of Technology, Paniglgasse 16, A-1040 Vienna, Austria; Fax: +43-1-5055304, Email: nejdl at vexpert.dbai.tuwien.ac.at Application Aspects of Distributed Artificial Intelligence Thies Wittig, Atlas Elektronik GmbH, Abt.TEF, Sebaldsbruecker Heerstrasse 235, D-W-2800 Bremen 44, Germany; Fax: +49-421-4573756, Email: t_wittig at eurokom.ie Applications of Reason Maintenance Systems Francois Charpillet, Jean-Paul Haton, CRIN/INRIA-Lorraine, B.P. 239, F-54506 Vandoeuvre-Les-Nancy Cedex, France; Fax: +33-93-413079, Email: charp at loria.crin.fr Artificial Intelligence and Music Gerhard Widmer, Austrian Research Institute for Artificial Intelli- gence, Schottengasse 3, A-1010 Vienna, Austria; Fax: +43-1-630652, Email: gerhard at ai.univie.ac.at Beyond Sequential Planning Gerd Grosse, FG Intellektik, TH Darmstadt, Alexanderstr.10, D-6100 Darmstadt, Germany; Fax: +49-6151-165326, Email: grosse at intellektik.informatik.th-darmstadt.de Concurrent Engineering: Requirements for Knowledge-Based Design Support Nel Wognum, Dept. of Computer Science, University of Twente, P.O.Box 217, NL-7500 AE Enschede, The Netherlands; Fax: +31-53-339605, Email: wognum at cs.utwente.nl Improving the Use of Knowledge-Based Systems with Explanations Patrick Brezillon, CNRS-LAFORIA, Box 169, University of Paris VI, 2 Place Jussieu, F-75252 Paris Cedex 05, France; Fax: +33-1-44277000, Email: brezil at laforia.ibp.fr The Theoretical Foundations of Knowledge Representation and Reasoning Gerhard Lakemeyer, Institut f.Informatik III, Universitaet Bonn, Roemerstr.164, D-W-5300 Bonn 1, Germany; Fax: +49-228-550382, Email: gerhard at uran.informatik.uni-bonn.de --- Workshops ----- Mon and Tue, August 3-4 Expert Judgement, Human Error, and Intelligent Systems Barry Silverman, Institute for AI, George Washington University, 2021 K St. 
NW, Suite 710, Washington, DC 20006, USA; Fax: (202)785-3382, Email: barry at gwusun.gwu.edu Logical Approaches to Machine Learning Celine Rouveirol, Universite Paris-Sud, LRI, Bat 490, F-91405 Orsay, France; Fax: +33-1-69416586, Email: celine at lri.lri.fr Spatial Concepts: Connecting Cognitive Theories with Formal Representations Simone Pribbenow, Email: pribbeno at informatik.uni-hamburg.de, and Christoph Schlieder, Institut f.Informatik und Gesellschaft, Friedrichstr.50, D-7800 Freiburg, Germany; Fax: +49-761-2034653, Email: cs at cognition.iig.uni-freiburg.de ======================== GENERAL INFORMATION =========================== DELEGATE'S FEE (in Austrian Schillings, approx. 14 AS = 1 ECU, 12 AS = 1 US$) early late on-site (rec.before) (Jun 1) (Jul 15) Members of ECCAI member organizations 4.500,- 5.000,- 6.000,- Non-Members 5.000,- 6.000,- 7.000,- Students 1.500,- 2.000,- 2.500,- The delegate's fee covers attendance at the scientific programme (invited talks, paper presentations, survey sessions, and workshops), conference documentation including the conference proceedings, admission to the industrial exhibition, and participation in selected evening events. TUTORIAL FEE (per tutorial) early late on-site (rec.before) (Jun 1) (Jul 15) Members of ECCAI member organizations 3.000,- 3.500,- 4.000,- Non-Members 3.500,- 4.000,- 4.500,- Students 1.500,- 2.000,- 2.500,- Tutorial Registration entitles to admission to that tutorial, admission to the exhibition, a copy of the course material, and refreshments during the tutorial. ACCOMODATION Hotels of different price categories, ranging from DeLuxe to the very cheap student hostel (available for non-students too), are available for the first week of August. The price ranges (in AS) are given below. Hotel Category single room double room with bath without bath with bath without bath DeLuxe ***** 1690,-/2375,- 2400,-/3200,- A **** 990,-/1300,- 1400,-/1790,- B *** 750,-/980,- 1100,-/1350,- Season Hotel 480,-/660,- 335,-/450,- 780,-/900,- 580,-/730,- Student Hostel 220,- 380,- The conference venue is located in a central district of Vienna. It can be reached easily by public transport. ============================ REGISTRATION ============================== For detailed information and registration material please contact the conference office: ADV c/o ECAI92 Trattnerhof 2 A-1010 Vienna, Austria Tel: +43-1-5330913-74, Fax: +43-1-5330913-77, Telex: 75311178 adv a or send your postal address via email to: ecai92 at ai.univie.ac.at  From mcolthea at laurel.ocs.mq.edu.au Mon Mar 23 17:24:19 1992 From: mcolthea at laurel.ocs.mq.edu.au (Max Coltheart) Date: Tue, 24 Mar 92 08:24:19 +1000 Subject: No subject Message-ID: <9203232224.AA12889@laurel.ocs.mq.edu.au> Models Of Reading Aloud: Dual-Route And Parallel-Distributed-Processing Approaches Max Coltheart, Brent Curtis and Paul Atkins School of Behavioural Sciences Macquarie University Sydney NSW 2109 Australia email: max at currawong.mqcc.mq.oz.au Submitted for publication March 23, 1992. Abstract It has often been argued that various facts about skilled reading aloud cannot be explained by any model unless that model possesses a dual-route architecture: one route from print to speech that may be described as lexical (in the sense that it operates by retrieving pronunciations from a mental lexicon) and another route from print to speech that may be described as non-lexical (in the sense that it computes pronunciations by rule, rather than by retrieving them from a lexicon). 
This broad claim has been challenged by Seidenberg and McClelland (1989, 1990). Their model has but a single route from print to speech, yet, they contend, it can account for major facts about reading which have hitherto been claimed to require a dual-route architecture. We identify six of these major facts about reading. The one-route model proposed by Seidenberg and McClelland can account for the first of these, but not the remaining five: how people read nonwords aloud, how they perform visual lexical decision, how two particular forms of acquired dyslexia can arise, and how different patterns of developmental dyslexia can arise. Since models with dual-route architectures can explain all six of these basic facts about reading, we suggest that this remains the viable architecture for any tenable model of skilled reading and learning to read. Preprints available from MC at the above address.  From terry at jeeves.UCSD.EDU Wed Mar 25 15:14:36 1992 From: terry at jeeves.UCSD.EDU (Terry Sejnowski) Date: Wed, 25 Mar 92 12:14:36 PST Subject: Neural Computation 4:2 Message-ID: <9203252014.AA02301@jeeves.UCSD.EDU> Neural Computation Volume 4, Issue 2, March 1992 Review First and Second-Order Methods for Learning: Steepest Descent and Newton's Method Roberto Battiti Article Efficient Simplex-Like Methods for Equilibria of Nonsymmetric Analog Networks Douglas A. Miller and Steven W. Zucker Note A Volatility Measure for Annealing in Feedback Neural Networks Joshua Alspector, Torsten Zeppenfeld and Stephan Luna Letters What Does the Retina Know about Natural Scenes? Joseph J. Atick and A. Norman Redlich A Simple Network Showing Burst Synchronization without Frequency-Locking Christof Koch and Heinz Schuster On a Magnitude Preserving Iterative MAXnet Algorithm Bruce W. Suter and Matthew Kabrisky A Fixed Size Storage O(n3) Time Complexity Learning Algorithm for Fully Recurrent Continually Running Networks Jurgen Schmidhuber Learning Complex, Extended Sequences Using the Principle of History Compression Jurgen Schmidhuber How Tight are the Vapnik-Chervonenkis Bounds? David Cohn and Gerald Tesauro Working Memory Networks for Learning Temporal Order with Application to 3-D Visual Object Recognition Gary Bradski, Gail A. Carpenter, and Stephen Grossberg ----- SUBSCRIPTIONS - VOLUME 4 - BIMONTHLY (6 issues) ______ $40 Student ______ $65 Individual ______ $150 Institution Add $12 for postage and handling outside USA (+7% for Canada). (Back issues from Volumes 1-3 are regularly available for $28 each.) ***** Special Offer -- Back Issues for $17 each ***** MIT Press Journals, 55 Hayward Street, Cambridge, MA 02142. (617) 253-2889. -----  From bap at james.psych.yale.edu Wed Mar 25 12:09:56 1992 From: bap at james.psych.yale.edu (Barak Pearlmutter) Date: Wed, 25 Mar 92 12:09:56 -0500 Subject: Why batch learning is slower In-Reply-To: "Thomas H. Hildebrandt "'s message of Fri, 20 Mar 92 10:37:57 -0500 <9203201537.AA04951@athos.csee.lehigh.edu> Message-ID: <9203251709.AA19744@james.psych.yale.edu> For a quadratic form minimum, in the batch case, without momentum, it is well known that a stepsize eta<2/lmax (where lmax = max eigenvalue of Hessian) gives convergence to the minimum. However, this is not true in the online or "stochastic gradient descent" case. In that case, a fixed stepsize leads to convergence to a neighborhood of the minimum, where the size of the neighborhood is determined by the stepsize, since it amplifies the noise in the noisy samples of the gradient. 
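A toy illustration of this point (the quadratics, the sampling scheme, and the step sizes below are arbitrary, and nothing here is taken from the posting itself):

    import numpy as np

    # Stochastic gradient descent on the average of E_1(w) = (w-1)^2/2 and
    # E_2(w) = (w+1)^2/2, whose average has its minimum at w* = 0.  With a
    # FIXED step size, w keeps hovering in a neighborhood of 0 whose size
    # grows with the step size, rather than settling at 0 itself.
    rng = np.random.default_rng(2)
    targets = np.array([1.0, -1.0])

    def long_run_rms(eta, steps=200000, burn_in=50000):
        z = targets[rng.integers(0, 2, size=steps)]   # random pattern each step
        w, acc = 0.0, 0.0
        for t in range(steps):
            w -= eta * (w - z[t])                     # gradient of (w - z)^2 / 2
            if t >= burn_in:
                acc += w * w
        return np.sqrt(acc / (steps - burn_in))

    for eta in (0.5, 0.1, 0.01):
        print("eta =", eta, "  long-run RMS distance from minimum ~", round(long_run_rms(eta), 3))

For this particular toy problem a short calculation gives a stationary RMS distance of $\sqrt{\eta/(2-\eta)}$, so the neighborhood shrinks roughly like $\sqrt{\eta}$ as the step size is reduced.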
For this reason, it is necessary for the stepsize to go to zero in the limit in the online case. In fact, it can be shown that $\sum_t \eta(t)$ must go to infinity to prevent convergence to a non-minimum, while $\sum_t \eta(t)^2$ must not go to infinity, to guarantee convergence. If these two conditions are satisfied, convergence to a minimum is achieved with probability 1. Of course, you are studing the "cyclic online presentation" case, which, although it fulfills the conditions of stochastic gradient, also has some additional structure. However, it is easy to convince yourself that this case does not permit a fixed step size. Consider a 1-d system, with two patterns, one of which gives $E_1 = (w-1)^2$ and the other $E_2 = (w+1)^2.$ Notice that $E = E_1 + E_2$ has a minimum at $w=0$. But with a step size that does not go to zero, $w$ will flip back and forth forever. --Barak Pearlmutter.  From p-mehra at uiuc.edu Thu Mar 26 01:17:36 1992 From: p-mehra at uiuc.edu (Pankaj Mehra) Date: Thu, 26 Mar 92 00:17:36 CST Subject: Debate on recent paper by Amari Message-ID: <9203260617.AA20244@rhea> Fellow Connectionists: I have recently come across some thought-provoking debate between Dr. Andras Pellionisz and Prof. Shun-Ichi Amari regarding Prof. Amari's recent paper in Neural Networks. (Refer to an earlier message from me 09/11/91 "Exploiting duality to analyze ANNs") The paper in question is: Shun-Ichi Amari, ``Dualistic Geometry of the Manifold of Higher-Order Neurons,'' Neural Networks, vol. 4, pp. 443-451, 1991. At the risk of irritating some of you, I am reproducing excerpts of this debate from recent issues of Neuron Digest, a moderated discussion forum. I urge you to (i) continue the debate on that forum instead of swamping connectionists' mailing list with responses; and, (ii) refrain from sending me e-mail responses. I understand that long messages to the entire list are discouraged but feel compelled to post this message because of its obvious relevance to all of us. Following are the excerpts of recent debate gleaned from Neuron Digest, Vol. 9, Issues 8-10,14. Contact moderator for subscription, submissions. Old issues are available via FTP from cattell.psych.upenn.edu (128.91.2.173). In what follows, "Editor's Notes" refer to the moderator's comments, not mine. -Pankaj Mehra University of Illinois ____________________________________________________________________________ Neuron Digest Saturday, 22 Feb 1992 Volume 9 : Issue 8 Today's Topics: Open letter to Dr. Sun-Ichi Amari ------------------------------ From hirsch at math.berkeley.edu Mon Mar 2 20:23:03 1992 From: hirsch at math.berkeley.edu (Morris W. Hirsch) Date: Mon, 02 Mar 92 17:23:03 -0800 Subject: Reply to Pellionisz' "Open Letter" Message-ID: DEar Michael [Arbib]: Bravo for your reply to Pellionisz, and for sending it to Neuron Digest! I was going to reply to ND along similar but less knowledgeable lines-- I may still do so if the so-called dispute continues. YOurs, --MOE Professor Morris W. Hirsch Department of Mathematics University of California Berkeley, CA 94720 USA e-mail: hirsch at math.berkeley.edu Phones: NEW area code 510: Office, 642-4318 Messages, 642-5026, 642-6550 Fax: 510-642-6726 ------------------------------ From Konrad.Weigl at sophia.inria.fr Wed Mar 4 06:24:22 1992 From: Konrad.Weigl at sophia.inria.fr (Konrad Weigl) Date: Wed, 04 Mar 92 12:24:22 +0100 Subject: Arbib's response to "open letter to Amari" Message-ID: Ref. Dr. 
Arbib's answer in Volume Digest V9 #9

I am not familiar enough with the work of Pellionisz, or proficient enough in the mathematics of General Spaces, to judge the mathematical rigour of his work; however, whatever that rigour, as far as I know he was the first to link the concept of Neural Networks with non-Euclidean Geometry at all. That he did so to analyze the dynamics of biological Neural networks, while Dr. Amari later used non-Euclidean Geometry to give a metric to spaces of different types of Neural Networks, and a geometric interpretation of learning, does not change one iota of that fact. This is not to denigrate Dr. Amari's contribution to the field, of course.

Konrad Weigl Tel. (France) 93 65 78 63 Projet Pastis Fax (France) 93 65 76 43 INRIA-Sophia Antipolis email Weigl at sophia.inria.fr 2004 Route des Lucioles B.P. 109 06561 Valbonne Cedex France

------------------------------

From ishimaru at hamamatsu-pc.ac.jp Sat Mar 7 07:58:59 1992 From: ishimaru at hamamatsu-pc.ac.jp (Kiyoto Ishimaru) Date: Sat, 07 Mar 92 14:58:59 +0200 Subject: Pellionisz' "Open Letter" Message-ID:

Dear Moderator: The recent response of Dr. Arbib caught my attention. The following is my opinion about the issue:

1) Citing or not citing in a paper should not be decided based on KINDNESS to earlier related research, but rather on KINDNESS to those who are supposed to read or come across the prospective paper.

2) Arguments about similarity or dissimilarity, based on the "subjective" Riemannian space, can last forever without any positive result. The most important thing is who first brought the tensor-analysis "technique" into the NN field. This is what should be discussed.

3) Political or social issues, such as Japan-bashing and fierce competition in R&D world-wide, should not be brought into this matter. That style of discussion does not produce any fruitful results, but makes the issue more complicated, worse, and intangible.

4) Dr. Amari's comments in his letter, quoted by Dr. Pellionisz ("Indeed, when I wrote that paper, I thought to refer to your paper" and "But if I did so, I could only state that it is nothing to do with the geometrical approach that I initiated"), may lead to the conclusion: if Dr. Pellionisz' paper had nothing to do with Dr. Amari's, citing it would not have crossed his mind at all (but he actually did consider it). Direct public comment from Dr. Amari is urged on this and other points, in order for both of them to get a fair judgement.

Please understand that I make these comments with some awkwardness, owing to my lack of deep background in the impact of tensor analysis on the NN field. However, I am pleased to express myself in this e-mail. Thank you.

Sincerely Yours Kiyoto Ishimaru Dept of Computer Science Hamamatsu Polytechnic College 643 Norieda Hamamatsu 432 Japan e-mail: ishimaru at hamamatsu-pc.ac.jp

------------------------------

From amari at sat.t.u-tokyo.ac.jp Tue Mar 10 11:16:32 1992 From: amari at sat.t.u-tokyo.ac.jp (Shun-ichi Amari) Date: Tue, 10 Mar 92 18:16:32 +0200 Subject: reply to the open letter to Amari Message-ID:

[[ Editor's Note: In a personal note, I thanked Dr. Amari for his response. I had assumed, incorrectly, that Dr. Pellionisz had sent a copy to Dr. Amari, who is not a Neuron Digest subscriber. I'm sure all readers will remember that Neuron Digest is not a peer-refereed journal but an informal forum for electronic communication.
I hope the debate can come to a fruitful conclusion. -PM ]]

Dear Editor: Professor Usui at Toyohashi Institute of Technology and Science kindly let me know that there is an "open letter to Amari" in Neuron Digest. I was quite surprised that an open letter to me was published without being sent to me. Moreover, the letter requires me to answer once more what I have already answered to Dr. Pellionisz. I will try once again to repeat my answer in more detail.

Reply to Dr. Pellionisz by Shun-ichi Amari

1. Dr. Pellionisz accuses me of holding two contradictory opinions: 1) that my work is a generalization of his, and 2) that my approach has nothing to do with his. This is incorrect. Once one reads my paper ("Dualistic geometry of the manifold of higher-order neurons", Neural Networks, vol. 4 (1991), pp. 443-451; see also another paper, "Information geometry of Boltzmann machines" by S. Amari, K. Kurata and H. Nagaoka, IEEE Trans. on Neural Networks, March 1992), it is immediately clear 1) that my work is in no way a generalization of his, and 2), more strongly, that it has nothing to do with Pellionisz' work. Dr. Pellionisz seems to be accusing me without having read or understood my paper at all. I would like to ask the readers to read my paper. For those readers who have not yet read my paper, I compare his work with mine in the following, because this is what Dr. Pellionisz has carefully avoided.

2. His work can be summarized as the claim that the main function of the cerebellum is the transformation of a covariant vector into a contravariant vector in a metric Euclidean space, since non-orthogonal reference bases are used in the brain. He has mentioned non-linear generalizations and so on in passing, but nothing scientific has been done along this line.

3. In my 1991 paper, I proposed a geometrical theory of the manifold of parameterized non-linear systems, with special reference to the manifold of non-linear higher-order neurons. I did not focus on the functions of any single neural network but on the mutual relations among different neural networks, such as the distance between two different neural networks, the curvature of a family of neural networks and its role, etc. Here the method of information geometry plays a fundamental role. It uses a dual pair of affine connections, which is a new concept in differential geometry, and has proved very useful for analyzing statistical inference problems, multiterminal information theory, the manifold of linear control systems, and so on (see S. Amari, Differential Geometrical Methods of Statistics, Springer Lecture Notes in Statistics, vol. 28, 1985, and many papers referred to in its second printing). A number of mathematicians are now studying this new subject. I have shown that the same method of information geometry is applicable to the manifold of neural networks, elucidating the capabilities and limitations of a family of neural networks in terms of their architecture. I have opened, I believe, a new and fertile field: the study not of the behaviors of single neural networks, but of the collective properties of the set, or manifold, of neural networks in terms of new differential geometry.

4. Now we can discuss the point. Is my theory a generalization of his theory? Definitely no. If A is a generalization of B, A should include B as a special case. My theory never includes any of his tensorial transformations. A network is merely a point of the manifold in my theory; I have studied the collective behavior of the manifold, not the properties of individual points.

5. The second point.
One may ask whether, even if my theory is not a generalization of his theory, it might still have something to do with it, so that I should have referred to his work. The answer is again no. Dr. Pellionisz insists that he is a pioneer of tensor theory and that my theory is also tensorial. This is not true. My theory is differential-geometrical, but it does not require any tensorial notation. Modern differential geometry has been constructed without using tensorial notations, although it is sometimes convenient to use them. As one sees from my paper, its essential part is described without tensor notations. In differential geometry, what is important is the intrinsic structure of manifolds, such as affine connections, parallel transports, curvatures, and so on. The Pellionisz theory has nothing to do with these differential-geometrical concepts. He used tensorial notation to point out that the role of the cerebellum is a special type of linear transformation, namely a covariant-contravariant linear transformation, which M. A. Arbib and I have criticized.

6. Dr. Pellionisz claims that he is the pioneer of the tensorial theory of neural networks. Whenever one uses a tensor, should one refer to Pellionisz? This is ridiculous. Who would claim to be the pioneer of using differential equations, linear algebra, probability theory, etc. in neural network theory? These are simply commonly used methods. Moreover, the tensorial method itself has been used in neural network research since early days. For example, in my 1967 paper (S. Amari, A mathematical theory of adaptive pattern classifiers, IEEE Trans. on EC, vol. 16, pp. 299-307), where I proposed the general stochastic gradient learning method for multilayer networks, I used the metric tensor C (p. 301) in order to transform a covariant gradient vector into the corresponding contravariant learning vector, although I suppressed the tensor notation there. However, on p. 303 I explicitly used tensorial notation in order to analyze the dynamic behavior of the modifiable parameter vectors. I have never claimed that Dr. Pellionisz should refer to this old paper, because the tensorial method itself is in common use among applied mathematicians, and my old theory has nothing to do with his except that both used a covariant-contravariant transformation and tensorial notation, a common mathematical concept.

7. I do not like non-productive, time-consuming and non-scientific discussions like this. If one reads my paper, everything will melt away. This has nothing to do with the fact that I am, unfortunately, a co-editor-in-chief of Neural Networks, nor with any threat to intellectual property (of tensorial theory), the world-wide competition in scientific research, etc., which Dr. Pellionisz hinted at as if such things were in the background. Instead, this reminded me of the horrible days when Professor M. A. Arbib and I were preparing our righteous criticism of his theory (not a criticism of using the tensor concept but of the theory itself). I repeatedly received astonishing interference, which I hope will never happen again. I will disclose my e-mail letter to Pellionisz in the following, hoping that he will disclose his first letter, including his unbelievable request, because it makes the situation and his desire clear. The reader will understand why I do not want to continue fruitless discussions with him. I also request that he read my paper and point out which concepts or theories in my paper are generalizations of his.
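For readers trying to follow the technical core of this exchange, the covariant-contravariant transformation both sides refer to is the standard metric-raising operation; the notation below is generic and not taken from either author's papers.
$$ \tilde{w}^{i} = \sum_{j} g^{ij} w_{j}, \qquad (g^{ij}) = (g_{ij})^{-1}, $$
where $(g_{ij})$ is a metric tensor (symmetric and positive definite) and $w_{j}$ is a covariant vector such as a gradient $\partial E/\partial \theta^{j}$. Written as a learning rule,
$$ \Delta\theta^{i} = -\,\eta \sum_{j} g^{ij} \frac{\partial E}{\partial \theta^{j}}, $$
the metric converts the covariant gradient into a contravariant update direction; the symmetry and positive-definiteness of $g_{ij}$ are precisely the restrictions on this class of linear maps that the Arbib-Amari criticism mentioned above refers to.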
The following is my old reply to Dr. Pellionisz, which he partly referred to in his "open letter to Amari". Dear Dr. Pellionisz: Thank you for your e-mail remarking on my recent paper entitled "Dualistic geometry of the manifold of higher-order neurons". As you know very well, we criticized your idea of the tensorial approach in our memorial joint paper with M. Arbib. The point is that, although the tensorial approach is welcome, it is too restrictive to think that brain function is merely a transformation between contravariant vectors and covariant vectors; even if we use linear approximations, the transformation should be free of the positivity and symmetry restrictions. As you may understand, these two are the essential restrictions of covariant-contravariant transformations. You are interested in analyzing a general but single neural network. Of course this is very important. However, what I am interested in is the geometrical structure of a set of neural networks (in other words, a set of brains). This is a new object of research. Of course, I did some work along this line in statistical neurodynamics, where a probability measure is introduced on a manifold of neural networks, and physicists later followed a similar idea (E. Gardner and others). However, there the geometrical structure is implicit. As you noted, I have written that my paper opens a new fertile field of neural network research, in the following two senses. First, we are treating a set of networks, not the behavior of a single network. There is a vast amount of research on single networks by analytical, stochastic, tensorial and many other mathematical methods. The point is to treat a new object of research, a manifold of neural networks. Secondly, I have proposed a new concept of dual affine connections, which mathematicians have recently been studying in more detail as mathematical research. So if you have studied the differential-geometrical structure of a manifold of neural networks, I should refer to it. If you have proposed a new concept of duality in affine connections, I should refer to it. If you are claiming that you used tensor analysis in analyzing the behaviors of single neural networks, that has nothing to do with the field which I have opened. Indeed, when I wrote that paper, I thought of referring to your paper. But if I had done so, I could only have stated that it has nothing to do with this new approach. Moreover, I would have needed to repeat our memorial criticism again. I do not want to engage in such irrelevant discussions. If you read my paper, I think you will understand what is newly opened by this approach. Since our righteous criticism of your memorable approach has been published, we do not need to repeat it again and again. I do hope your misunderstanding is resolved by this mail and by reading my paper. Sincerely yours, Shun-ichi Amari ------------------------------ Neuron Digest Wednesday, 25 Mar 1992 Volume 9 : Issue 14 Today's Topics: Is it time to change our referring? ("Open Letter" debate) ------------------------------ From Miklos.Boda at eua.ericsson.se Wed Mar 25 10:50:35 1992 From: Miklos.Boda at eua.ericsson.se (Miklos.Boda@eua.ericsson.se) Date: Wed, 25 Mar 92 16:50:35 +0100 Subject: Is it time to change our referring? ("Open Letter" debate) Message-ID: [[ Editor's Note: Another contribution to the ongoing discussion prompted by Pellionisz' "Open Letter" from several issues ago. I think the following deals with some of the larger issues of citation which are important to consider. -PM ]] Dear Moderator: IS IT TIME TO CHANGE OUR REFERRING ?
My contribution to the issue of "referring or not" initiated by the "Open Letter" of Dr. Pellionisz: 1. It is obvious that Dr. Pellionisz introduced a brilliant concept when he brought the tensor analysis approach into neural network research. Despite any imperfections that may exist in a new approach to the mathematics of general neural spaces, his theory has already been used by several followers and its influence is undeniable. Dr. Amari certainly has the right to claim that his differential-geometrical approach has nothing to do with earlier comparable (tensor) approaches. However, if such an oversight occurred, readers would be left uncertain, and questions would be put to Dr. Amari about how he compares his approach to Pellionisz'. 2. The issue of referring or not is, unfortunately, a classic problem (mostly between former colleagues who hold a grudge against each other, or between international competitors; see the similar debate in AIDS research recently). The problem can be solved only if we start talking openly about this serious issue, even if it sometimes feels inconvenient to do so. (Thus it was a good and brave move of Dr. Pellionisz to bring the subject up.) Why do we use reference lists at all? a. First, we must list all titles which we actually used. b. Second, by tradition, we help the reader by giving some basic references for a better general understanding. c. Third, we establish the claims of our paper over comparable approaches, i.e. claiming the novelty of ideas that only superficially seem related to other approaches. 3. I think Dr. Amari may have had point a. in mind, whereas Dr. Pellionisz considers at least points a. and c. important; those who wish to be kind to the reader would also consider point b. 4. Maybe it is time now to change our habits and adapt to the new computerized literature search, where we can find "comparables" by looking for keywords. Declaring proper keywords could therefore replace references (anyone who searches the neural network literature for "tensor" will get his hands full of Pellionisz' papers). Using such a method, one could restrict citation to those items that one actually uses. This new method, so far, is not universally accepted, and it would not state an author's claims over comparable approaches. No search for "AIDS virus" would settle claims about who pioneered an approach; the claims themselves must originate from the authors. 5. More on the etiquette of debate: I hope Dr. Arbib already regrets his precipitate remarks (ND v9#9). Miklos Boda Ellemtel, Telecommunication Systems Laboratories Box 1505 125 25 Alvsjo Sweden ------------------------------  From pagre at weber.UCSD.EDU Thu Mar 26 18:58:07 1992 From: pagre at weber.UCSD.EDU (Phil Agre) Date: Thu, 26 Mar 92 15:58:07 -0800 Subject: AI Journal Message-ID: <9203262358.AA21526@weber> Stan Rosenschein and I are editing a special issue of the AI Journal on "Computational Theories of Interaction and Agency". We started from the observation that a wide variety of people in AI and cognitive science are using principled characterizations of interactions between agents and their environments to guide their theorizing and designing and modeling. Some connectionist projects I've heard about fit this description as well, and people engaged in such projects would be most welcome to contribute articles to our special issue. I've enclosed the call for papers. Please feel free to pass it along to anyone who might be interested.
And I can send further details to anyone who's curious. Thanks very much. Phil Agre, UCSD Artificial Intelligence: An International Journal Special Issue on Computational Theories of Interaction and Agency Edited by Philip E. Agre (UC San Diego) and Stanley J. Rosenschein (Teleos Research) Call for Papers Recent computational research has greatly deepened our understanding of agents' interactions with their environments. The first round of research in this area developed `situated' and `reactive' architectures that interact with their environments in a flexible way. These `environments', however, were characterized in very general terms, and often purely negatively, as `uncertain', `unpredictable', and the like. In the newer round of research, psychologists and engineers are using sophisticated characterizations of agent-environment interactions to motivate explanatory theories and design rationales. This research opens up a wide variety of new issues for computational research. But more fundamentally, it also suggests a revised conception of computation itself as something that happens in an agent's involvements in its world, and not just in the abstractions of its thought. The purpose of this special issue of Artificial Intelligence is to draw together the remarkable variety of computational research that has recently been developing along these lines. These include: * Task-level robot sensing and action strategies, as well as projects that integrate classical robot dynamics with symbolic reasoning. * Automata-theoretic formalizations of agent-environment interactions. * Studies of "active vision" and related projects that approach perception within the broader context of situated activity. * Theories of the social conventions and dynamics that support activity. * Foundational analyses of situated computation. * Models of learning that detect regularities in the interactions between an agent and its environment. This list is only representative and could easily be extended to include further topics in robotics, agent architectures, artificial life, reactive planning, distributed AI, human-computer interaction, cognitive science, and other areas. What unifies these seemingly disparate research projects is their emerging awareness that the explanation and design of agents depends on principled characterizations of the interactions between those agents and their environments. We hope that this special issue of the AI Journal will clarify trends in this new research and take a first step towards a synthesis. The articles in the special issue will probably also be reprinted in a book to be published by MIT Press. The deadline for submitted articles is 1 September 1992. Send articles to: Philip E. Agre Department of Communication D-003 University of California, San Diego La Jolla, California 92093-0503 Queries about the special issue may be sent to the above address or to pagre at weber.ucsd.edu. Prospective contributors are encouraged to contact the editors well before the deadline.  From hinton at ai.toronto.edu Fri Mar 27 10:20:09 1992 From: hinton at ai.toronto.edu (Geoffrey Hinton) Date: Fri, 27 Mar 1992 10:20:09 -0500 Subject: attempted character assasination Message-ID: <92Mar27.102016edt.319@neuron.ai.toronto.edu> Amari has made profound and original contributions to the neural network field and I think people should read and understand his papers before they clutter up our mailing list with the wild accusations on neuron-digest. 
Geoff  From thildebr at aragorn.csee.lehigh.edu Sun Mar 29 10:07:02 1992 From: thildebr at aragorn.csee.lehigh.edu (Thomas H. Hildebrandt ) Date: Sun, 29 Mar 92 10:07:02 -0500 Subject: Why batch learning is slower In-Reply-To: Barak Pearlmutter's message of Wed, 25 Mar 92 12:09:56 -0500 <9203251709.AA19744@james.psych.yale.edu> Message-ID: <9203291507.AA09353@aragorn.csee.lehigh.edu> Date: Wed, 25 Mar 92 12:09:56 -0500 From: Barak Pearlmutter Reply-To: Pearlmutter-Barak at CS.YALE.EDU Begin Pearlmutter quote: Of course, you are studying the "cyclic online presentation" case, which, although it fulfills the conditions of stochastic gradient, also has some additional structure. However, it is easy to convince yourself that this case does not permit a fixed step size. Consider a 1-d system, with two patterns, one of which gives $E_1 = (w-1)^2$ and the other $E_2 = (w+1)^2.$ Notice that $E = E_1 + E_2$ has a minimum at $w=0$. But with a step size that does not go to zero, $w$ will flip back and forth forever. --Barak Pearlmutter. End quote One of the interesting results of the paper is that the rapidity of convergence can be judged in terms of the redundancy of the training set, where (as mentioned in the announcement) redundancy is measured in terms of the correlation (inner product) between pairs of samples. A more thorough analysis is needed, but at first glance it appears that if the redundancy (collinearity, correlation) measure is $R \in [-1, 1]$, then the slowdown factor $C$ (roughly, the number of epochs needed to approach the minimum) is given by C = 1 / (1 - abs(R)). One may consider the most redundant pair of samples to dominate the convergence rate (one place where more analysis is needed). If a pair of samples is collinear, then as you have pointed out, the convergence rate goes to zero; the above equation gives this directly, since the redundancy R is 1 in that case and C diverges. If all of the samples are orthogonal, on the other hand, R is 0, C is 1, and the per-sample algorithm will find the absolute minimum in one epoch. For intermediate degrees of collinearity, $C$ gives a measure of the severity of the "cross-stitching" which the MGD algorithm will suffer, with larger values indicating slower convergence. All of the above discussion is in terms of the linear network described in my paper, but it is hard to see how adding nonlinearity to the problem can improve the situation, except by chance. Thomas H. Hildebrandt CSEE Department Lehigh University  From thildebr at aragorn.csee.lehigh.edu Mon Mar 30 10:47:48 1992 From: thildebr at aragorn.csee.lehigh.edu (Thomas H. Hildebrandt ) Date: Mon, 30 Mar 92 10:47:48 -0500 Subject: Paper on Neocognitron Training avail on neuroprose In-Reply-To: David Lovell's message of Thu, 12 Mar 92 10:48:29 EST <9203120048.AA02305@c10.elec.uq.oz.au> Message-ID: <9203301547.AA10142@aragorn.csee.lehigh.edu> About two weeks ago, David Lovell posted to CONNECTIONISTS, advertising the note which he has placed in the neuroprose archive. I have a few comments on the paper. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% A NOTE ON A CLOSED-FORM TRAINING ALGORITHM FOR THE NEOCOGNITRON David Lovell, Ah Chung Tsoi & Tom Downs Intelligent Machines Laboratory, Department of Electrical Engineering University of Queensland, Queensland 4072, Australia In this note, a difficulty with the application of Hildebrandt's closed-form training algorithm for the neocognitron is reported.
In applying this algorithm we have observed that S-cells frequently fail to respond to features that they have been trained to extract. We present results which indicate that this training vector rejection is an important factor in the overall classification performance of the neocognitron trained using Hildebrandt's procedure. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% In my paper "Optimal training of thresholded linear correlation classifiers" (IEEE Transactions on Neural Networks, Nov 1991), a one-step training procedure is outlined, which configures the cells in a layer of Fukushima-type neurons so that their classification regions are mutually exclusive. Starting at 3 dimensions, this mutual exclusion condition causes gaps to open up among the cones that define the classification regions. An input pattern which falls into one of these gaps will be rejected. As the number of dimensions considered increases, so does the relative volume assigned to these rejection regions. In the general linear model presented in the paper, if the number of classes is fewer than the number of dimensions, then the half-angle at the vertex of the cone is set to 45 degrees, in order to obtain mutual exclusion. On the other hand, to obtain complete coverage (no rejections), it is necessary to choose an angle which depends on the number of dimensions as follows: cos(vertex half-angle) = 1 / sqrt(dimensions) So for the first 10 dimensions, the angles at which complete coverage is achieved are given by (in degrees): 0 45 54.73 60 63.43 65.9 67.79 69.29 70.53 71.56 In Fukushima's training procedure, classification regions compete for training patterns, so that in its final state, the configuration of the network more closely resembles one which achieves complete coverage than one which achieves mutual exclusion of the classification regions. In classification problems, the correct classification rate and the rejection rate are fundamentally in opposition. Therefore, it is not surprising that the network trained using Fukushima's procedure achieved a higher classification rate than the one trained using my one-step training procedure. A fairer comparison would be obtained by relaxing the mutual exclusion of REGIONS in the latter network to the mutual exclusion of SAMPLES (i.e. that a training sample falls in one and only one classification region). In that case, the rejection rate for my network is expected to be lower, in general, and the classification rate correspondingly higher. Thomas H. Hildebrandt CSEE Department Lehigh University  From dhg at scs.carleton.ca Mon Mar 30 19:30:58 1992 From: dhg at scs.carleton.ca (Daryl Herbert Graf) Date: Mon, 30 Mar 92 19:30:58 EST Subject: Theory of Neural Group Selection Message-ID: <9203310030.AA23100@scs.carleton.ca> I am looking for critical reviews of Gerald Edelman's theory of neuronal group selection. I have read the material in the attached bibliography and I would like to balance this with additional papers, analyses, observations, or opinions regarding this work from the connectionist and neuroscience communities. In order to avoid cluttering the news, I would ask those who wish to reply to do so directly to me. I will post a summary in the near future. Many thanks in advance. Daryl Graf Study Group on Evolutionary Computing Techniques School of Computer Science Carleton University Ottawa, Ontario, Canada email: dhg at scs.carleton.ca Bibliography: Edelman, G.M.
(1987) "Neural Darwinism: the Theory of Neuronal Group Selection", Basic, NY Edelman, G.M. (1989) "The Remembered Present: a Biological Theory of Consciousness", Basic, NY Edelman, G.M. (1981) Group selection as the basis for higher brain function. In "The Organization of the Cerebral Cortex", F.O. Schmitt, F.G. Worden, G. Adelman, S.G. Dennis, eds., pp. 535-563, MIT Press, Cambridge, Mass. Edelman, G.M. (1978) Group selection and phasic reentrant signalling: a theory of higher brain function. In "The Mindful Brain", G.M. Edelman, V.B. Mountcastle, eds., pp. 51-100, MIT Press, Cambridge, Mass. Edelman, G.M., Finkel, L.H. (1984) Neuronal group selection in the cerebral cortex. In "Dynamic Aspects of Neocortical Function", G.M. Edelman, W.E. Gall, W.M. Cowan, eds., pp. 653-695. Wiley, NY Edelman, G.M., Reeke, G.N., (1982) Selective networks capable of representative transformations, limited generalizations, and associative memory, Proc. Natl. Acad. Sci. USA 79:2091-2095 Finkel, L.H., Edelman, G.M. (1987) Population rules for synapses in networks. In "Synaptic Function", G.M. Edelman, W.E. Gall, W.M. Cowan, eds., pp.711-757, Wiley, NY Finkel, L.H., Edelman, G.M. (1985) Interaction of synaptic modification rules within populations of neurons, Proc. Natl. Acad. Sci. USA 82:1291-1295 Reeke, G.N., Finkel, L.H., Sporns, O., G.M. Edelman, (1989) Synthetic neural modeling: a multilevel approach to the analysis of brain complexity. In "Signal and Sense: Local and Global Order in Perceptual Maps", G.M. Edelman, W.E. Gall, W.M. Cowan, eds., Wiley, NY, pp. 607-707  From netgene at virus.fki.dth.dk Fri Mar 20 07:30:41 1992 From: netgene at virus.fki.dth.dk (virus mail server) Date: Fri, 20 Mar 92 13:30:41 +0100 Subject: HUMOPS: NetGene splice site prediction Message-ID: ------------------------------------------------------------------------ NetGene Neural Network Prediction of Splice Sites Reference: Brunak, S., Engelbrecht, J., and Knudsen, S. (1991). Prediction of Human mRNA donor and acceptor sites from the DNA sequence. Journal of Molecular Biology 220:49-65. ------------------------------------------------------------------------ Report ERRORS to Jacob Engelbrecht engel at virus.fki.dth.dk. Potential splice sites are assigned by combining output from a local and a global network. The prediction is made with two cutoffs: 1) Highly confident sites (no or few false positives, on average 50% of the true sites detected); 2) Nearly all true sites (more false positives - on average of all positions 0.1% false positive donor sites and 0.4% false positive acceptor sites, at 95% detection of true sites). The network performance on sequences from distantly related organisms has not been quantified. Due to the non-local nature of the algorithm sites closer than 225 nucleotides to the ends of the sequence cannot be assigned. Column explanations, field identifiers: POSITION in your sequence (either first or last base in intron). Joint CONFIDENCE level for the site (relative to the cutoff). EXON INTRON gives 20 bases of sequence around the predicted site. LOCAL is the site confidence from the local network. GLOBAL is the site confidence from the global network. 
------------------------------------------------------------------------ The sequence: HUMOPS contains 6953 bases, and has the following composition: A 1524 C 2022 G 1796 T 1611 1) HIGHLY CONFIDENT SITES: ========================== ACCEPTOR SITES: POSITION CONFIDENCE INTRON EXON LOCAL GLOBAL 4094 0.27 TGTCCTGCAG^GCCGCTGCCC 0.63 0.66 5167 0.20 TGCCTTCCAG^TTCCGGAACT 0.59 0.64 3812 0.17 CTGTCCTCAG^GTACATCCCC 0.68 0.54 3164 0.02 TCCTCCTCAG^TCTTGCTAGG 0.79 0.32 2438 0.01 TGCCTTGCAG^GTGAAATTGC 0.78 0.33 DONOR SITES: POSITION CONFIDENCE EXON INTRON LOCAL GLOBAL 3979 0.38 CGTCAAGGAG^GTACGGGCCG 0.92 0.74 2608 0.17 GCTGGTCCAG^GTAATGGCAC 0.85 0.54 4335 0.06 GAACAAGCAG^GTGCCTACTG 0.83 0.41 2) NEARLY ALL TRUE SITES: ========================= ACCEPTOR SITES: POSITION CONFIDENCE INTRON EXON LOCAL GLOBAL 4094 0.55 TGTCCTGCAG^GCCGCTGCCC 0.63 0.66 3812 0.52 CTGTCCTCAG^GTACATCCCC 0.68 0.54 3164 0.49 TCCTCCTCAG^TCTTGCTAGG 0.79 0.32 5167 0.49 TGCCTTCCAG^TTCCGGAACT 0.59 0.64 2438 0.48 TGCCTTGCAG^GTGAAATTGC 0.78 0.33 4858 0.39 TCATCCATAG^AAAGGTAGAA 0.77 0.20 3712 0.36 CCTTTTCCAG^GGAGGGAATG 0.88 -0.01 4563 0.33 CCCTCCACAG^GTGGCTCAGA 0.81 0.05 5421 0.33 TTTTTTTAAG^AAATAATTAA 0.75 0.13 3783 0.29 TCCCTCACAG^GCAGGGTCTC 0.64 0.26 3173 0.25 GTCTTGCTAG^GGTCCATTTC 0.52 0.36 4058 0.24 CTCCCTGGAG^GAGCCATGGT 0.43 0.51 1784 0.22 TCACTGTTAG^GAATGTCCCA 0.68 0.08 6512 0.21 CCCTTGCCAG^ACAAGCCCAT 0.67 0.08 2376 0.20 CCCTGTCTAG^GGGGGAGTGC 0.61 0.16 1225 0.18 CCCCTCTCAG^CCCCTGTCCT 0.65 0.07 1743 0.13 TTCTCTGCAG^GGTCAGTCCC 0.62 0.03 3834 0.13 GGGCCTGCAG^TGCTCGTGTG 0.26 0.58 4109 0.13 TGCCCAGCAG^CAGGAGTCAG 0.29 0.54 6557 0.13 CATTCTGGAG^AATCTGCTCC 0.56 0.12 1638 0.11 CCATTCTCAG^GGAATCTCTG 0.62 0.00 247 0.10 GCCTTCGCAG^CATTCTTGGG 0.55 0.11 6766 0.09 CTATCCACAG^GATAGATTGA 0.64 -0.06 906 0.08 AATTTCACAG^CAAGAAAACT 0.61 -0.02 6499 0.08 CAGTTTCCAG^TTTCCCTTGC 0.55 0.06 378 0.07 GTACCCACAG^TACTACCTGG 0.24 0.52 3130 0.07 CTGTCTCCAG^AAAATTCCCA 0.51 0.12 4272 0.07 ACCATCCCAG^CGTTCTTTGC 0.58 0.00 4522 0.07 TGAATCTCAG^GGTGGGCCCA 0.51 0.12 5722 0.07 ACCCTCGCAG^CAGCAGCAAC 0.55 0.05 2316 0.06 CTTCCCCAAG^GCCTCCTCAA 0.40 0.27 2357 0.06 GCCTTCCTAG^CTACCCTCTC 0.39 0.28 2908 0.06 TTTGGTCTAG^TACCCCGGGG 0.51 0.10 4112 0.06 CCAGCAGCAG^GAGTCAGCCA 0.25 0.50 1327 0.05 TTTGCTTTAG^AATAATGTCT 0.52 0.06 844 0.04 GTTTGTGCAG^GGCTGGCACT 0.62 -0.11 1045 0.04 TCCCTTGGAG^CAGCTGTGCT 0.54 0.01 1238 0.03 CTGTCCTCAG^GTGCCCCTCC 0.50 0.06 2976 0.03 CCTAGTGCAG^GTGGCCATAT 0.62 -0.12 3825 0.03 CATCCCCGAG^GGCCTGCAGT 0.16 0.60 1508 0.02 TGAGATGCAG^GAGGAGACGC 0.43 0.16 2257 0.02 CTCTCCTCAG^CGTGTGGTCC 0.53 0.00 5712 0.02 ATCCTCTCAG^ACCCTCGCAG 0.51 0.05 2397 0.00 CCCTCCTTAG^GCAGTGGGGT 0.41 0.16 4800 0.00 CATTTTCTAG^CTGTATGGCC 0.47 0.07 5016 0.00 TGCCTAGCAG^GTTCCCACCA 0.59 -0.11 DONOR SITES: POSITION CONFIDENCE EXON INTRON LOCAL GLOBAL 3979 0.75 CGTCAAGGAG^GTACGGGCCG 0.92 0.74 2608 0.51 GCTGGTCCAG^GTAATGGCAC 0.85 0.54 4335 0.38 GAACAAGCAG^GTGCCTACTG 0.83 0.41 656 0.32 ACCCTGGGCG^GTATGAGCCG 0.56 0.66 5859 0.11 ACCAAAAGAG^GTGTGTGTGT 0.85 0.07 4585 0.09 GCTCACTCAG^GTGGGAGAAG 0.86 0.03 1708 0.06 TGGCCAGAAG^GTGGGTGTGC 0.85 0.01 6196 0.05 CCCAATGAGG^GTGAGATTGG 0.86 -0.01 667 0.03 TATGAGCCGG^GTGTGGGTGG 0.23 0.71 ------------------------------------------------------------------------ From Connectionists-Request at CS.CMU.EDU Sun Mar 1 00:05:16 1992 From: Connectionists-Request at CS.CMU.EDU (Connectionists-Request@CS.CMU.EDU) Date: Sun, 01 Mar 92 00:05:16 EST Subject: Bi-monthly Reminder Message-ID: <15436.699426316@B.GP.CS.CMU.EDU> *** DO NOT FORWARD TO ANY OTHER LISTS *** This is an 
automatically posted bi-monthly reminder about how the CONNECTIONISTS list works and how to access various online resources. CONNECTIONISTS is not an edited forum like the Neuron Digest, or a free-for-all newsgroup like comp.ai.neural-nets. It's somewhere in between, relying on the self-restraint of its subscribers. Membership in CONNECTIONISTS is restricted to persons actively involved in neural net research. The following posting guidelines are designed to reduce the amount of irrelevant messages sent to the list. Before you post, please remember that this list is distributed to over a thousand busy people who don't want their time wasted on trivia. Also, many subscribers pay cash for each kbyte; they shouldn't be forced to pay for junk mail. Happy hacking. -- Dave Touretzky & Hank Wan --------------------------------------------------------------------- What to post to CONNECTIONISTS ------------------------------ - The list is primarily intended to support the discussion of technical issues relating to neural computation. - We encourage people to post the abstracts of their latest papers and tech reports. - Conferences and workshops may be announced on this list AT MOST twice: once to send out a call for papers, and once to remind non-authors about the registration deadline. A flood of repetitive announcements about the same conference is not welcome here. - Requests for ADDITIONAL references. This has been a particularly sensitive subject lately. Please try to (a) demonstrate that you have already pursued the quick, obvious routes to finding the information you desire, and (b) give people something back in return for bothering them. The easiest way to do both these things is to FIRST do the library work to find the basic references, then POST these as part of your query. Here's an example: WRONG WAY: "Can someone please mail me all references to cascade correlation?" RIGHT WAY: "I'm looking for references to work on cascade correlation. I've already read Fahlman's paper in NIPS 2, his NIPS 3 abstract, and found the code in the nn-bench archive. Is anyone aware of additional work with this algorithm? I'll summarize and post results to the list." - Announcements of job openings related to neural computation. - Short reviews of new text books related to neural computation. To send mail to everyone on the list, address it to Connectionists at CS.CMU.EDU ------------------------------------------------------------------- What NOT to post to CONNECTIONISTS: ----------------------------------- - Requests for addition to the list, change of address and other administrative matters should be sent to: "Connectionists-Request at cs.cmu.edu" (note the exact spelling: many "connectionists", one "request"). If you mention our mailing list to someone who may apply to be added to it, please make sure they use the above and NOT "Connectionists at cs.cmu.edu". - Requests for e-mail addresses of people who are believed to subscribe to CONNECTIONISTS should be sent to postmaster at appropriate-site. If the site address is unknown, send your request to Connectionists-Request at cs.cmu.edu and we'll do our best to help. A phone call to the appropriate institution may sometimes be simpler and faster. - Note that in many mail programs a reply to a message is automatically "CC"-ed to all the addresses on the "To" and "CC" lines of the original message. If the mailer you use has this property, please make sure your personal response (request for a Tech Report etc.) is NOT broadcast over the net. 
- Do NOT tell a friend about Connectionists at cs.cmu.edu. Tell him or her only about Connectionists-Request at cs.cmu.edu. This will save your friend from public embarrassment if she/he tries to subscribe. - Limericks should not be posted here. ------------------------------------------------------------------------------- The CONNECTIONISTS Archive: --------------------------- All e-mail messages sent to "Connectionists at cs.cmu.edu" starting 27-Feb-88 are now available for public perusal. A separate file exists for each month. The files' names are: arch.yymm where yymm stand for the obvious thing. Thus the earliest available data are in the file: arch.8802 Files ending with .Z are compressed using the standard unix compress program. To browse through these files (as well as through other files, see below) you must FTP them to your local machine. ------------------------------------------------------------------------------- How to FTP Files from the CONNECTIONISTS Archive ------------------------------------------------ 1. Open an FTP connection to host B.GP.CS.CMU.EDU (Internet address 128.2.242.8). 2. Login as user anonymous with password your username. 3. 'cd' directly to one of the following directories: /usr/connect/connectionists/archives /usr/connect/connectionists/bibliographies 4. The archives and bibliographies directories are the ONLY ones you can access. You can't even find out whether any other directories exist. If you are using the 'cd' command you must cd DIRECTLY into one of these two directories. Access will be denied to any others, including their parent directory. 5. The archives subdirectory contains back issues of the mailing list. Some bibliographies are in the bibliographies subdirectory. Problems? - contact us at "Connectionists-Request at cs.cmu.edu". ------------------------------------------------------------------------------- How to FTP Files from the Neuroprose Archive -------------------------------------------- Anonymous FTP on archive.cis.ohio-state.edu (128.146.8.52) pub/neuroprose directory This directory contains technical reports as a public service to the connectionist and neural network scientific community which has an organized mailing list (for info: connectionists-request at cs.cmu.edu) Researchers may place electronic versions of their preprints or articles in this directory, announce availability, and other interested researchers can rapidly retrieve and print the postscripts. This saves copying, postage and handling, by having the interested reader supply the paper. (Along this line, single spaced versions, if possible, will help!) To place a file, put it in the Inbox subdirectory, and send mail to pollack at cis.ohio-state.edu. Within a couple of days, I will move and protect it, and suggest a different name if necessary. When you announce a paper, you should consider whether (A) you want it automatically forwarded to other groups, like NEURON-DIGEST, (which gets posted to comp.ai.neural-networks) and if you want to provide (B) free or (C) prepaid hard copies for those unable to use FTP. If you do offer hard copies, be prepared for an onslaught. One author reported that when they allowed combination AB, the rattling around of their "free paper offer" on the worldwide data net generated over 2000 hardcopy requests! Experience dictates the preferred paradigm is to announce an FTP only version with a prominent "**DO NOT FORWARD TO OTHER GROUPS**" at the top of your announcement to the connectionist mailing list. 
Current naming convention is author.title.filetype[.Z] where title is enough to discriminate among the files of the same author. The filetype is usually "ps" for postscript, our desired universal printing format, but may be tex, which requires more local software than a spooler. Very large files (e.g. over 200k) must be squashed (with either a sigmoid function :) or the standard unix "compress" utility, which results in the .Z affix. To place or retrieve .Z files, make sure to issue the FTP command "BINARY" before transferring files. After retrieval, call the standard unix "uncompress" utility, which removes the .Z affix. An example of placing a file is attached as an appendix, and a shell script called Getps in the directory can perform the necessary retrieval operations. For further questions contact: Jordan Pollack Assistant Professor CIS Dept/OSU Laboratory for AI Research 2036 Neil Ave Email: pollack at cis.ohio-state.edu Columbus, OH 43210 Phone: (614) 292-4890 Here is an example of naming and placing a file: gvax> cp i-was-right.txt.ps rosenblatt.reborn.ps gvax> compress rosenblatt.reborn.ps gvax> ftp archive.cis.ohio-state.edu Connected to archive.cis.ohio-state.edu. 220 archive.cis.ohio-state.edu FTP server ready. Name: anonymous 331 Guest login ok, send ident as password. Password:neuron 230 Guest login ok, access restrictions apply. ftp> binary 200 Type set to I. ftp> cd pub/neuroprose/Inbox 250 CWD command successful. ftp> put rosenblatt.reborn.ps.Z 200 PORT command successful. 150 Opening BINARY mode data connection for rosenblatt.reborn.ps.Z 226 Transfer complete. 100000 bytes sent in 3.14159 seconds ftp> quit 221 Goodbye. gvax> mail pollack at cis.ohio-state.edu Subject: file in Inbox. Jordan, I just placed the file rosenblatt.reborn.ps.Z in the Inbox. The INDEX sentence is "Boastful statements by the deceased leader of the neurocomputing field." Please let me know when it is ready to announce to Connectionists at cmu. BTW, I enjoyed reading your review of the new edition of Perceptrons! Frank ------------------------------------------------------------------------ How to FTP Files from the NN-Bench Collection --------------------------------------------- 1. Create an FTP connection from wherever you are to machine "pt.cs.cmu.edu" (128.2.254.155). 2. Log in as user "anonymous" with password your username. 3. Change remote directory to "/afs/cs/project/connect/bench". Any subdirectories of this one should also be accessible. Parent directories should not be. 4. At this point FTP should be able to get a listing of files in this directory and fetch the ones you want. Problems? - contact us at "nn-bench-request at cs.cmu.edu".  From geiger at medusa.siemens.com Mon Mar 2 17:39:57 1992 From: geiger at medusa.siemens.com (Davi Geiger) Date: Mon, 2 Mar 92 17:39:57 EST Subject: No subject Message-ID: <9203022239.AA03797@medusa.siemens.com> CALL FOR PAPERS NEURAL INFORMATION PROCESSING SYSTEMS (NIPS) -Natural and Synthetic- Monday, November 30 - Thursday, December 3, 1992 Denver, Colorado This is the sixth meeting of an inter-disciplinary conference which brings together neuroscientists, engineers, computer scientists, cognitive scientists, physicists, and mathematicians interested in all aspects of neural processing and computation. A day of tutorial presentations (Nov 30) will precede the regular session and two days of focused workshops will follow at a nearby ski area (Dec 4-5).
Major categories and examples of subcategories for paper submissions are the following; Neuroscience: Studies and Analyses of Neurobiological Systems, Inhibition in cortical circuits, Signals and noise in neural computation, Theoretical Neurobiology and Neurophysics. Theory: Computational Learning Theory, Complexity Theory, Dynamical Systems, Statistical Mechanics, Probability and Statistics, Approximation Theory. Implementation and Simulation: VLSI, Optical, Software Simulators, Implementation Languages, Parallel Processor Design and Benchmarks. Algorithms and Architectures: Learning Algorithms, Constructive and Pruning Algorithms, Localized Basis Functions, Tree Structured Networks, Performance Comparisons, Recurrent Networks, Combinatorial Optimization, Genetic Algorithms. Cognitive Science & AI: Natural Language, Human Learning and Memory, Perception and Psychophysics, Symbolic Reasoning. Visual Processing: Stereopsis, Visual Motion, Recognition, Image Coding and Classification. Speech and Signal Processing: Speech Recognition, Coding, and Synthesis, Text-to-Speech, Adaptive Equalization, Nonlinear Noise Removal. Control, Navigation, and Planning: Navigation and Planning, Learning Internal Models of the World, Trajectory Planning, Robotic Motor Control, Process Control. Applications: Medical Diagnosis or Data Analysis, Financial and Economic Analysis, Timeseries Prediction, Protein Structure Prediction, Music Processing, Expert Systems. The technical program will contain plenary, contributed oral and poster presentations with no parallel sessions. All presented papers will be due (January 13, 1993) after the conference in camera-ready format and will be published by Morgan Kaufmann. Submission Procedures: Original research contributions are solicited, and will be carefully refereed. Authors must submit six copies of both a 1000-word (or less) summary and six copies of a separate single-page 50-100 word abstract clearly stating their results postmarked by May 22, 1992 (express mail is not necessary). Accepted abstracts will be published in the conference program. Summaries are for program committee use only. At the bottom of each abstract page and on the first summary page indicate preference for oral or poster presentation and specify one of the above nine broad categories and, if appropriate, sub-categories (For example: Poster, Applications- Expert Systems; Oral, Implementation-Analog VLSI). Include addresses of all authors at the front of the summary and the abstract and indicate to which author correspondence should be addressed. Submissions will not be considered that lack category information, separate abstract sheets, the required six copies, author addresses, or are late. Mail Submissions To: Jack Cowan NIPS*92 Submissions University of Chicago Dept. of Mathematics 5734 So. University Ave. Chicago IL 60637 Mail For Registration Material To: NIPS*92 Registration SIEMENS Research Center 755 College Road East Princeton, NJ, 08540 All submitting authors will be sent registration material automatically. Program committee decisions will be sent to the correspondence author only. NIPS*92 Organizing Committee: General Chair, Stephen J. Hanson, Siemens Research & Princeton University; Program Chair, Jack Cowan, University of Chicago; Publications Chair, Lee Giles, NEC; Publicity Chair, Davi Geiger, Siemens Research; Treasurer, Bob Allen, Bellcore; Local Arrangements, Chuck Anderson, Colorado State University; Program Co-Chairs: Andy Barto, U. 
Mass.; Jim Burr, Stanford U.; David Haussler, UCSC; Alan Lapedes, Los Alamos; Bruce McNaughton, U. Arizona; Bartlett Mel, JPL; Mike Mozer, U. Colorado; John Pearson, SRI; Terry Sejnowski, Salk Institute; David Touretzky, CMU; Alex Waibel, CMU; Halbert White, UCSD; Alan Yuille, Harvard U.; Tutorial Chair: Stephen Hanson; Workshop Chair: Gerry Tesauro, IBM; Domestic Liaisons: IEEE Liaison, Terrence Fine, Cornell; Government & Corporate Liaison, Lee Giles, NEC; Overseas Liaisons: Mitsuo Kawato, ATR; Marwan Jabri, University of Sydney; Benny Lautrup, Niels Bohr Institute; John Bridle, RSRE; Andreas Meier, Simon Bolivar U. DEADLINE FOR SUMMARIES & ABSTRACTS IS MAY 22, 1992 (POSTMARKED) please post  From STAY8026%IRUCCVAX.UCC.IE at bitnet.cc.cmu.edu Mon Mar 2 09:45:00 1992 From: STAY8026%IRUCCVAX.UCC.IE at bitnet.cc.cmu.edu (STAY8026%IRUCCVAX.UCC.IE@bitnet.cc.cmu.edu) Date: Mon, 2 Mar 1992 14:45 GMT Subject: Composite networks Message-ID: <01GH682VYUB400105W@IRUCCVAX.UCC.IE> Hi, I am interested in modelling tasks in which invariant information from previous input-output pairs is brought to bear on the acquisition of current input-output pairs. Thus I want to use previously extracted regularity to influence current processing. Does anyone think this is feasible?? At year's memory conference in Lancaster (England) Dave Rumelhart mentioned the need to develop nets which incorporate a distinction between learning and memory and exploit the attributes of both. Thus, our present learning procedures, such as BACKPROP, are useful for combining, without interference, multiple input-output pairs, while competitive learning systems are useful for computing regularities. How about attempting to combine the two? Has anyone tried? My suspicion is that such composite networks might be usefully applied to a number of issues in natural language processing, such as, perhaps, the syntactic embeddings considered by Pollack (1990, Artificial Intelligence). At first I thought that a sequential net of the sort discussed by Hinton and Shallice (1991 Psych. Rev.) might fit the bill, but now I'm not sure. Any ideas or suggestions? P. J. Hampson University College Cork Ireland  From elman at crl.ucsd.edu Tue Mar 3 13:14:43 1992 From: elman at crl.ucsd.edu (Jeff Elman) Date: Tue, 3 Mar 92 10:14:43 PST Subject: TR: 'Connectionism and the study of change' Message-ID: <9203031814.AA12473@crl.ucsd.edu> The following technical report is available via anonymous ftp or surface mail. Instructions on how to obtain it follow. ----------------------------------------------------------- Connectionism and the Study of Change Elizabeth A. Bates Jeffrey L. Elman Center for Research in Language University of California, San Diego CRL Technical Report 9202 Developmental psychology is not just about the study of children; it is also about the study of change. In this paper, we start with a brief historical review showing how the study of change has been abandoned within developmental psychology, in favor of strong "preformationist" views, of both the nativist variety (i.e. mental structures are present at birth, or they mature along a predetermined schedule) and the empiricist variety (i.e. mental structures are the "internalization" of social interactions with a competent adult). From either point of view, nothing really new can emerge across the course of development.
These perspectives contrast with the truly interactionist view espoused by Jean Piaget, who argued for an emergence of new mental structures at the interface between organism and environment. We argue that these historical trends (including a widespread repudiation of Piaget) are rooted in the First Computer Metaphor, i.e. use of the serial digital computer as a metaphor for mind. Aspects of the First Computer Metaphor that have led to this kind of preformationism include (1) discrete representations, (2) absolute rules, (3) learning as programming and/or selection from an array of pre-established hypotheses, and (4) the hardware/software distinction. We go on to argue that connectionism (i.e. the Second Computer Metaphor) offers a way out of the Nature-Nurture impasse in developmental psychology, providing useful formalizations that capture the elusive notion of emergent form. In fact, connectionist models can be used to salvage certain forms of developmental nativism, offering concrete insights into what an innate idea might really look like (or better yet, 50% of an innate idea, awaiting completion through learning). ----------------------------------------------------------- If you have access to the internet, you may obtain a copy of this report by doing the following: unix% ftp crl.ucsd.edu /* or: ftp 128.54.16.43 */ Connected to crl.ucsd.edu. 220 crl FTP server (SunOS 4.1) ready. Name (crl.ucsd.edu:elman): anonymous 331 Guest login ok, send ident as password. Password: 230 Guest login ok, access restrictions apply. ftp> binary 200 Type set to I. ftp> cd pub/neuralnets 250 CWD command successful. ftp> get tr9202.ps.Z 200 PORT command successful. 150 Binary data connection for tr9202.ps.Z (128.54.16.43,1507) (98903 bytes). 226 Binary Transfer complete. local: tr9202.ps.Z remote: tr9202.ps.Z 98903 bytes received in 0.57 seconds (1.7e+02 Kbytes/s) ftp> quit 221 Goodbye. unix% zcat tr9202.ps.Z | lpr If you are unable to obtain the TR in this manner, you may request a hardcopy by sending email to staight at crl.ucsd.edu (include your postal address).  From emelz at cognet.ucla.edu Tue Mar 3 14:39:32 1992 From: emelz at cognet.ucla.edu (Eric Melz) Date: Tue, 03 Mar 92 11:39:32 -0800 Subject: Composite networks In-Reply-To: Your message of Mon, 02 Mar 92 14:45:00 +0000. <01GH682VYUB400105W@IRUCCVAX.UCC.IE> Message-ID: <9203031939.AA10914@tinman.cognet.ucla.edu> >I am interested in modelling tasks in which invariant information from >previous input-output pairs is brought to bear on the acquisition of current >input-output pairs. Thus I want to use previously extracted regularity to >influence current processing. Does anyone think this is feasible?? I just submitted a paper to the upcoming Cognitive Science Conference entitled "Developing Microfeatures by Analogy". The paper describes how a localist constraint-satisfaction model of analogical mapping can be used to guide the formation of the internal representations developed by backpropagation. Applied to Hinton's (1986) model of family-tree learning, this hybrid model exhibits near-perfect generalization performance when as many as 25% of the training cases are removed from the training corpus, compared to near 0% generalization when only backprop is used. If you (or anyone else) are interested in obtaining a hardcopy of this paper, send email directly to me.
Eric Melz UCLA Department of Psychology  From pablo at cs.washington.edu Tue Mar 3 15:22:12 1992 From: pablo at cs.washington.edu (David Cohn) Date: Tue, 3 Mar 92 12:22:12 -0800 Subject: Composite networks Message-ID: <9203032022.AA21552@june.cs.washington.edu> P. J. Hampson (STAY8026%IRUCCVAX.UCC.IE at bitnet.cc.cmu.edu) asks: > I am interested in modelling tasks in which invariant information from > previous input-output pairs is brought to bear on the acquisition of current > input-output pairs. Thus I want to use previously extracted regularity to > influence current processing. ... I'm not sure I've read the intent of the posting correctly, but this sounds like it may be able to draw on some of the recent work in "active" learning systems. These are learning systems that have some control over what their inputs will be. A common form of active learning is that of "learning by queries," where a neural network "asks for" new training examples from some part of a domain based on its evaluation of previous training examples (e.g. Cohn et al., Baum and Lang, Hwang et al. and MacKay). > At year's memory conference in Lancaster (England) Dave Rumelhart mentioned > the need to develop nets which incorporate a distinction between learning > and memory and exploit the attributes of both. ... The distinction usually made in querying is a bit different, and consists of a loop iterating between sampling and learning. With a neural network, the "learning" generally consists of simply training on the sampled data, and thus could be thought of as analogous to stuffing the data into "memory." The driving algorithm directs the sampling based on previously learned data to optimize the utility of new examples. This approach has been tried on a number of relatively complicated, yet still "toy" problems; current efforts are to overcome the representational and computational complexity problems that arise as one makes the transition to the domain of "real world" problems like speech. -David Cohn e-mail: pablo at cs.washington.edu Dept. of Computer Science & Eng. phone: (206) 543-7798 University of Washington, FR-35 Seattle, WA 98195 References: Cohn, Atlas & Ladner. (1990) Training Connectionist Networks with Queries and Selective Sampling. In D. Touretzky, ed., "Advances In Neural Info. Processing 2" Baum and Lang. (1991) Constructing Hidden Units using Examples and Queries. In Lippmann et al., eds., "Advances In Neural Info. Processing 3" Hwang, Choi, Oh and Marks. (1990) Query learning based on boundary search and gradient computation of trained multilayer perceptrons. In Proceedings, IJCNN 1990. MacKay. (1991) Bayesian methods for adaptive models. Ph.D. thesis, Dept. of Computation and Neural Systems, California Institute of Technology.  From pratt at cs.rutgers.edu Tue Mar 3 17:46:38 1992 From: pratt at cs.rutgers.edu (pratt@cs.rutgers.edu) Date: Tue, 3 Mar 92 17:46:38 EST Subject: Composite networks Message-ID: <9203032246.AA14834@binnacle.rutgers.edu> P.J. Hampson asks: >From STAY8026 at iruccvax.ucc.ie Mon Mar 2 09:45:00 1992 >Subject: Composite networks >Hi, > >I am interested in modelling tasks in which invariant information from >previous input-output pairs is brought to bear on the acquisition of current >input-output pairs. Thus I want to use previously extracted regularity to >influence current processing. Does anyone think this is feasible?? >...
Papers on how networks can be constructed modularly (source and target have different topologies, are responsible for different classes) include: [Waibel et al., IEEASSP], [Pratt & Kamm, IJCNN91], [Pratt et al., AAAI91], [Pratt, CLNL92]. The working title and abstract for my upcoming PhD thesis (to be described in [Pratt, 1992b]) are as follows: Transferring Previously Learned Back-Propagation Neural Networks to New Learning Tasks Lori Pratt Neural network learners traditionally extract most of their information from a set of training data. If training data is in short supply, the learned classifier may perform poorly. Although this problem can be addressed partly by carefully choosing network parameters, this process is ad hoc and requires expertise and manual intervention by a system designer. Several symbolic and neural network inductive learners have explored how a domain theory which supplements training data can be automatically incorporated into the training process to bias learning. However, research to date in both fields has largely ignored an important potential knowledge source: classifiers that have been trained previously on related tasks. If new classifiers were able to build directly on previous results, then training speed, performance, and the ability to effectively utilize small amounts of training data could potentially be substantially improved. This thesis introduces the problem of {\em transfer} of information from a trained learner to a new learning task. It also presents an algorithm for transfer between neural networks. Empirical results from several domains demonstrate that this algorithm can improve learning speed on a variety of tasks. This will be published in part as [Pratt, 1992]. --Lori -------------------------------------------------------------------------------- References: @article{ waibel-89b, MYKEY = " waibel-89b : .bap .unr .unb .tem .spc .con ", TITLE = "Modularity and Scaling in Large Phonemic Neural Networks", AUTHOR = "Alexander Waibel and Hidefumi Sawai and Kiyohiro Shikano", journal = "IEEE Transactions on Acoustics, Speech, and Signal Processing", VOLUME = 37, NUMBER = 12, MONTH = "December", YEAR = 1989, PAGES = {1888-1898} } @inproceedings{ pratt-91, MYKEY = " pratt-91 : .min .bap .app .spc .con ", AUTHOR = "Lorien Y. Pratt and Jack Mostow and Candace A. Kamm", TITLE = "{Direct Transfer of Learned Information among Neural Networks}", BOOKTITLE = "Proceedings of the Ninth National Conference on Artificial Intelligence (AAAI-91)", PAGES = {584--589}, ADDRESS = "Anaheim, CA", YEAR = 1991, } @inproceedings{ pratt-91b, MYKEY = " pratt-91b : .min .bap .app .spc .con ", AUTHOR = "Lorien Y. Pratt and Candace A. Kamm", TITLE = "Improving a Phoneme Classification Neural Network through Problem Decomposition", YEAR = 1991, MONTH = "July", BOOKTITLE = "Proceedings of the International Joint Conference on Neural Networks (IJCNN-91)", ADDRESS = "Seattle, WA", PAGES = {821--826}, ORGANIZATION = "IEEE", } @incollection{ pratt-92, MYKEY = " pratt-92 : .min .bap .app .spc .con ", AUTHOR = "Lorien Y. Pratt", TITLE = "Experiments on the Transfer of Knowledge Between Neural Networks", BOOKTITLE = "Computational Learning Theory and Natural Learning Systems, Constraints and Prospects", EDITOR = "S. Hanson and G. Drastal and R. Rivest", YEAR = 1992, PUBLISHER = "MIT Press", CHAPTER = "4.1", NOTE = "To appear", } @incollection{ pratt-92b, MYKEY = " pratt-92b : .min .bap .app .spc .con ", AUTHOR = "Lorien Y. 
Pratt", TITLE = "Non-literal information transfer between neural networks", BOOKTITLE = "Neural Networks: Theory and Applications {II}", EDITOR = "R.J.Mammone and Y. Y. Zeevie", YEAR = 1992, PUBLISHER = "Academic Press", NOTE = "To appear", } ------------------------------------------------------------------- L. Y. Pratt ,_~o Computer Science Department pratt at cs.rutgers.edu _-\_<, Rutgers University (*)/'(*) Hill Center (908) 932-4974 (CoRE building office) New Brunswick, NJ 08903, USA (908) 846-4766 (home)  From echown at engin.umich.edu Tue Mar 3 16:19:53 1992 From: echown at engin.umich.edu (Eric Chown) Date: Tue, 3 Mar 92 16:19:53 -0500 Subject: Article available Message-ID: <5756ff371.000b0f6@mtrans.engin.umich.edu> **DO NOT FORWARD TO OTHER GROUPS** The following paper has been added to the archive: Tracing Recurrent Activity in Cognitive Elements (TRACE): A Model of Temporal Dynamics in a Cell Assembly Stephen Kaplan, Martin Sonntag and Eric Chown Department of Electrical Engineering and Computer Science, and Department of Psychology, The University of Michigan The article appeared in Connection Science 3:179-206 (1991) Abstract: Hebb's introduction of the cell assembly concept marks the beginning of modern connectionism, yet its implications remail largely unexplored and its potential unexploited. Lately, however, promising efforts have been made to utilize recurrent connections, suggesting the timeliness of a reexamination of the cell assembly as a key element in a cognitive connectionism. Our approach emphasizes the psychological functions of activity in a cell assembly. This provides an opportunity to explore the dynamic behavior of the cell assembly considered as a continuous system, an important topic that we feel has not been given sufficient attention. A step-by-step analysis leads to an identification of characteristic temporal patterns and of necessary control systems. Each step of this analysis leads to a corresponding building block in a set of emerging equations. A series of experiments is then described that explore the implications of the theoretically derived equations in terms of the time course of activity generated by a simulation under different conditions. Finally the model is evaluated in terms of whether the various contraints deemed appropriate can be met, whether the resulting solution is robust, and whether the solution promises sufficient utility and generality. ------------------------------------------------------------------------------- This paper has been placed in the neuroprose directory in compressed form. The file name is kaplan.trace.ps.Z If your are unable to ftp this paper please address reprint requests to: Professor Stephen Kaplan Psychological Laboratories Mason Hall The University of Michigan Ann Arbor, MI 48109-1027 e-mail: Stephen_Kaplan at um.cc.umich.edu Here is how to ftp this paper: unix> ftp cheops.cis.ohio-state.edu (or 128.146.8.62) Name: anonymous Password: neuron ftp> cd pub/neuroprose ftp> binary ftp> get kaplan.trace.ps.Z ftp> quit unix> uncompress kaplan.trace.ps.Z unix> lpr kaplan.trace.ps  From inf21!finnoff at ztivax.uucp Wed Mar 4 06:48:14 1992 From: inf21!finnoff at ztivax.uucp (William Finnoff) Date: Wed, 4 Mar 92 12:48:14 +0100 Subject: BP-Stoc. Appr. Message-ID: <9203041148.AA07939@basalt> I'm looking for additional references on the relationship between pattern for pattern backpropagation and stochastic approximation/algorithms/optimization (Robins Munroe, etc.). 
I am aware of the following White, H., Some asymptotic results for learning in single hidden-layer feedforward network models, Jour. Amer. Stat. Ass. 84, no. 408, p. 1003-1013, (1989), White, H., Learning in artificial neural networks: A statistical perspective, Neural Computation 1, 1989, pp. 425-464, Darken C. and Moody J., Note on learning rate schedules for stochastic optimization, in Advances in Neural Information Processing Systems 3., Lippmann, R. Moody, J., and Touretzky, D., ed., Morgan Kaufmann, San Mateo, (1991), and the latest contribution of the last two authors at NIPS(91). I have a fairly good list of literature with regards to the general theory of stochastic algorithms. Three examples are listed below: Bouton C., Approximation Gaussienne d'algorithmes stochastiques a dynamique Markovienne. Thesis, Paris VI, (in French), (1985) Kushner, H.J., and Schwartz, A., An invariant measure approach to the convergence of stochastic approximations with state dependent noise. SIAM j. Control and Opt. 22, 1, p. 13-27, (1984) Metivier, M. and Priouret, P., Th'eor`emes de convergence presque-sure pour une classe d'algorithmes stochastiques `a pas d'ecroissant. Prob. Th. and Rel. Fields 74, p. 403-28, (in French), (1987). I am therefore only interested in references in which the relationship to BP is explicit. Any help in this matter will be appreciated.  From georg at ai.univie.ac.at Wed Mar 4 11:57:45 1992 From: georg at ai.univie.ac.at (Georg Dorffner) Date: Wed, 4 Mar 1992 17:57:45 +0100 Subject: Symposium announcement Message-ID: <199203041657.AA07770@chicago.ai.univie.ac.at> Announcement Symposium on CONNECTIONISM AND COGNITIVE PROCESSING as part of the Eleventh European Meeting on Cybernetics and Systems Research (EMCSR) April 21 - 24, 1992 University of Vienna, Austria Chairs: Noel Sharkey (Univ. of Exeter) Georg Dorffner (Univ. of Vienna) The following papers will be presented: Thursday afternoon (Apr. 23) Conflict Detection in Asynchronous Winner-Take-All Structures M.Deng, Penn State Univ., USA Weightless Neurons and Categorisation Modelling M.H.Gera, Univ. of London, UK Semantic Transitions in a Hierarchical Memory Network M.Herrmann, M.Usher, Tel Aviv Univ., ISR Type Generalisations on Distributed Representations D.Mundi, N.E.Sharkey, Univ. of Exeter, UK Non-Conceptual Content and Parallel Distributed Processing, a Match Made in Cognitive Science Heaven? R.L.Chrisley, Univ. of Oxford, UK Aspects of Rules and Connectionism E.Prem, Austrian Research Inst. for AI, A Friday Morning (April 24) Sub-symbolic Inference: Inferring Verb Meaning A.Baldwin, Univ. of Exeter, UK Mental Models in Connectionist Networks V.Ajjanagadde, Univ. of Tuebingen, D Connectionism and the Issue of Compositionality and Systematicity L.Niklasson, N.E.Sharkey, Univ. of Skoevde, S Friday Afternoon INVITED LECTURE: The Causal Role of the Constituents of Superpositional Representations N.E.Sharkey, Univ. of Exeter, UK followed by a moderated discussion with all previous presenters Section Neural Networks EMG/EEG Pattern Recognition by Neural Networks A.Hiraiwa, N.Uchida, K.Shimohara, NTT Human Interface Labs., J Simulation of Navigation of Mobile Robots with Non-Centralized Neuromorphic Control L.F.B. Almeida, E.P.L.Passos, Inst.Militar de Engenharia, BRA Weightless and Threshold-Controlled Neurocomputing O.Kufudaki, J.Horejs, Czechoslovak Acad.of Sciences, CS Neural Networks Learning with Genetic Algorithms P.M.Palagi, L.A.V. 
de Carvalho, Univ.Fed.do Rio de Janeiro, BRA - * - Among the plenary lectures of the conference, Tuesday morning will feature Fuzzy Logic, Neural Networks and Soft Computing L. Zadeh, UC Berkeley, USA Furthermore, the conference will include symposia on the following topics: - General Systems Methodology - Mathematical Systems Theory - Computer Aided Process Interpretation - Fuzzy Sets, Approximate Reasoning and Knowledge-Based Systems - Designing and Systems - Humanity, Architecture and Conceptualization - Biocybernetics and Mathematical Biology - Cybernetics in Medicine - Cybernetics of Socio-Economic Systems - Systems, Management and Organization - Cybernetics of National Development - Communication and Computers - Intelligent Autonomous Systems - Artificial Intelligence - Impacts of Artificial Intelligence Conference Fee: AS 2,400 for contributors, AS 3,400 for participants. (incl.proceedings (2 volumes) and two receptions; 12 AS = 1 US$ approx.). The proceedings will also be available from World Scientific Publishing Co., entitled "Cybernetics and Systems '92; R.Trappl (ed.)" Registration will be possible at the conference site (main building of the University of Vienna). You can also contact: EMCSR Conference Secretariat Austrian Society for Cybernetic Studies Schottengasse 3 A-1010 Vienna, Austria Tel: +43 1 535 32 810 Fax: +43 1 63 06 52 Email: sec at ai.univie.ac.at  From LC4A%ICINECA.BITNET at BITNET.CC.CMU.EDU Wed Mar 4 14:19:46 1992 From: LC4A%ICINECA.BITNET at BITNET.CC.CMU.EDU (F. Ventriglia) Date: Wed, 04 Mar 92 14:19:46 SET Subject: Neural Networks School Message-ID: <01GH8N7EHME8CTZB4P@BITNET.CC.CMU.EDU> Dear Sir, I would like to inform you that I am organizing an International School on Neural Modeling and Neural Networks, as moderator of this E-mail Network could you propagate the notice of this School among the subscribers of your network? Could you insert my name in the list of subscribers? Looking forward to hearing from you. My best wishes and regards. Francesco Ventriglia ================ INTERNATIONAL SCHOOL on NEURAL MODELLING and NEURAL NETWORKS Capri (Italy) - September 27th-October 9th, 1992 Director F. Ventriglia An International School on Neural Modelling and Neural Networks was organized under the sponsorship of the Italian Group of Cybernetics and Biophysics of the CNR and of the Institute of Cybernetics of the CNR; sponsor the American Society for Mathematical Biology. The purpose of the school is to give to young scientists and to migrating senior scientists some landmarks in the inflationary universe of researches in neural modelling and neural networks. Towards this aim some well known experts will give lectures in different areas comprising neural structures and functions, single neuron dynamics, oscillations in small group of neurons, statistical neurodynamics of neural networks, learning and memory. In the first part, some neurobiological foundations and some formal models of single (or small groups of) neurons will be stated. The topics will be: TOPICS LECTURERS 1. Neural Structures * Szentagothai, Budapest 2. Correlations in Neural Activity * Abeles, Jerusalem 3. Single Neuron Dynamics: deterministic models * Rinzel, Bethesda 4. Single Neuron Dynamics: stochastic models * Ricciardi, Naples 5. Oscillations in Neural Systems * Ermentrout, Pittsburgh 6. Noise in Neural Systems * Erdi, Budapest The second part will be devoted to Neural Networks, i.e. to models of neural systems and of learning and memory. The topics will be: TOPICS LECTURERS 7. 
Mass action in Neural Systems * Freeman, Berkeley 8. Statistical Neurodynamics: kinetic approach * Ventriglia, Naples 9. Statistical Neurodynamics: sigmoidal approach * Cowan, Chicago 10.Attractor Neural Networks in Cortical Conditions * Amit, Roma 11."Real" Neural Network Models * Traub, Yorktown Heights 12.Pattern Recognition in Neural Networks * Fukushima, Osaka 13.Learning in Neural Networks * Tesauro, Yorktown Heights WHO SHOULD ATTEND Applicants for the international School should be actively engaged in the fields of biological cybernetics, biomathematics or computer science, and have a good background in mathematics. As the number of participants must be limited to 70, preference may be given to students who are specializing in neural modelling and neural networks and to professionals wha are seeking new materials for biomathematics or computer science courses. SCHOOL FEES The school fee is Italian Lire 500.000 and includes notes, lunch and coffee- break for the duration of the School. REGISTRATION A limited number of grants (covering the registration fee of Lit. 500.000) is available. The organizator applied to the Society of Mathematical Biology for travel funds for participants who are member of the SMB. Preference will be given to students, postdoctoral fellows and young faculty (1-2 years) after PhD. PROCEDURE FOR APPLICATION Applicants should contact: Dr. F. Ventriglia Registration Capri International School Istituto di Cibernetica Via Toiano 6 80072 - Arco Felice (NA) Italy Tel. (39-) 81-8534 138 E-Mail LC4A at ICINECA (bitnet) Fax (39-) 81-5267 654 Tx 710483  From patkins at laurel.ocs.mq.edu.au Thu Mar 5 12:09:48 1992 From: patkins at laurel.ocs.mq.edu.au (Paul Atkins) Date: Thu, 5 Mar 92 12:09:48 EST Subject: Ability to generalise over multiple training runs Message-ID: <9203050209.AA16943@laurel.ocs.mq.edu.au> In their recent technical report, Bates & Elman note that "it is possible for several different networks to reach the same solution to a problem, each with a totally different set of weights." (p. 13b) I am interested in the relationship between this phenomenon and the measurement of a network's ability to generalise. From singh at envy.cs.umass.edu Wed Mar 4 10:33:27 1992 From: singh at envy.cs.umass.edu (singh@envy.cs.umass.edu) Date: Wed, 4 Mar 92 10:33:27 -0500 Subject: Composite Networks Message-ID: <9203041533.AA11969@gluttony.cs.umass.edu> Hi! ***This is in repsonse to P. J. Hampson's message about composite networks.*** >From STAY8026 at iruccvax.ucc.ie Mon Mar 2 09:45:00 1992 >Subject: Composite networks >Hi, > >I am interested in modelling tasks in which invariant information from >previous input-output pairs is brought to bear on the acquisition of current >input-output pairs. Thus I want to use previously extracted regularity to >influence current processing. Does anyone think this is feasible?? >... I have studied learning agents that have to learn to solve MULTIPLE sequential decision tasks (SDTs) in the same external environment. Specifically, I have looked at reinforcement learning agents that have to solve a set of compositionally-structured sequential decision tasks. E.g., consider a navigation environment ( a robot in a room): Task 1: Go to location A optimally. Task 2: Go to location B optimally. Task 3: Go to location A and then to B optimally. Tasks 1 and 2 are 'elemental' SDTs and Task 3 is a 'composite' SDT. I have studied two different ways of achieving the obvious kind of TRANSFER of LEARNING across such a set of tasks. 
I am going to (try and) be brief - anyone interested in further discussion or my papers can contact me individually.

Method 1:
*********
I have used a modified Jacobs-Jordan-Nowlan-Hinton ``mixture of expert modules'' network with Watkins' Q-learning algorithm to construct a mixture of ``adaptive critics'' that learns the elemental tasks in separate modules and then the gating module learns to sequence the correct elemental modules to solve the composite tasks. Note that the representation of the tasks is not ``linguistic'' and therefore the agent cannot simply ``parse'' the composite task representation to determine which elemental modules to sequence. The decomposition has to be discovered by trial-and-error. Transfer of learning is achieved by sharing the solution of previously acquired elemental tasks across multiple composite tasks. Sequential decision tasks are particularly difficult to learn to solve because there is no supervised target information, only a success/failure response at the end of the task.

Ref:
---
@InProceedings{Singh-NIPS4,
  author = "Singh,S.P.",
  title = "On the efficient learning of multiple sequential tasks",
  booktitle = "Advances in Neural Information Processing Systems 4",
  year = "1992",
  editor = "J.E. Moody and S.J. Hanson and R.P. Lippmann",
  OPTpages = "", OPTorganization = "",
  publisher = "Morgan Kaufmann",
  address = "San Mateo, CA",
  OPTmonth = "", OPTnote = "Oral"}

@Article{Singh-MLjournal,
  author = "Singh,S.P.",
  title = "Transfer of Learning by Composing Solutions for Elemental Sequential Tasks",
  journal = "Machine Learning",
  year = "1992",
  OPTvolume = "", OPTnumber = "", OPTpages = "", OPTmonth = "",
  OPTnote = "to appear"}

@phdthesis{Watkins-thesis,
  author="C. J. C. H. Watkins",
  title="Learning from Delayed Rewards",
  school="Cambridge Univ.",
  address="Cambridge, England",
  year=1989}

@Article{Jacobs-Jordan-Nowlan-Hinton,
  author = "R. A. Jacobs and M. I. Jordan and S. J. Nowlan and G. E. Hinton",
  title = "Adaptive Mixtures of Local Experts",
  journal = "Neural Computation",
  year = "1991",
  volume = "3",
  number = "1",
  OPTpages = "", OPTmonth = "", OPTnote = ""}

Method 2:
*********
Method 1 did not learn models of the environment. For learning to solve a single SDT it is not always clear that the considerable expense of doing system identification is warranted (Barto and Singh, Gullapalli); however, if an agent is going to solve multiple tasks in the same environment it is almost certainly going to be useful. I consider a hierarchy of world-models, where the ``actions/operators'' for upper level models are the policies for tasks lower in the hierarchy. I prove that for compositionally-structured tasks, doing Dynamic Programming in such upper level models leads to the same solutions as doing it in the real world - only it is much faster since the actions of the upper level world models ignore much TEMPORAL DETAIL.
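(For readers who want the flavour of the learning rule inside each elemental module, here is a minimal sketch of one-step, Watkins-style tabular Q-learning - an illustration only, not the code used in the papers below; the environment interface, the epsilon-greedy choice and all constants are placeholder assumptions.)

import random
from collections import defaultdict

def q_learning_episode(env, Q, alpha=0.1, gamma=0.9, eps=0.1):
    """Run one episode of one-step Q-learning on a single elemental task.
    `env` is assumed to provide reset(), step(action) -> (state, reward, done)
    and a list `env.actions`; reward arrives only at the end of the task."""
    s = env.reset()
    done = False
    while not done:
        if random.random() < eps:                      # explore
            a = random.choice(env.actions)
        else:                                          # exploit current estimates
            a = max(env.actions, key=lambda act: Q[(s, act)])
        s_next, r, done = env.step(a)
        best_next = 0.0 if done else max(Q[(s_next, act)] for act in env.actions)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s_next
    return Q

# One Q table per elemental module, e.g.  Q_A = defaultdict(float)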
Ref: **** @InProceedings{Singh-AAAI92, author = "Singh, S.P.", title = "Reinforcement learning with a hierarchy of abstract models", booktitle = "Proceedings of the Tenth National Conference on Artificial Intelligence", year = "1992", OPTeditor = "", OPTpages = "", OPTorganization = "", OPTpublisher = "", address = "San Jose,CA", OPTmonth = "", note = "Forthcoming" } @InProceedings{Singh-ML92, author = "Singh, S.P.", title = "Scaling reinforcement learning algorithms by learning variable temporal resolution models", booktitle = "Proceedings of the Machine Learning Conference, 1992", year = "1992", OPTeditor = "", OPTpages = "", OPTorganization = "", OPTpublisher = "", OPTaddress = "", OPTmonth = "", OPTnote = "to appear" } @inproceedings{Barto-Singh, title="On the Computational Economics of Reinforcement Learning", author="Barto, A.G. and Singh,S.P.", booktitle="Proceedings of the 1990 Connectionist Models Summer School", year="Nov. 1990", address="San Mateo, CA", editors="Touretzsky, D.S. and Elman, J.L. and Sejnowski, T.J. and Hinton, G.E.", publisher="Morgan Kaufmann", status="Read"} @incollection{Gullapalli, author="V. Gullapalli", title="A Comparison of Supervised and Reinforcement Learning Methods on a Reinforcement Learning Task", booktitle = "Proceedings of the 1991 {IEEE} Symposium on Intelligent Control", address="Arlington, VA", year="1991"} satinder. satinder at cs.umass.edu  From robtag at udsab.dia.unisa.it Wed Mar 4 09:59:26 1992 From: robtag at udsab.dia.unisa.it (Tagliaferri Roberto) Date: Wed, 4 Mar 92 15:59:26 +0100 Subject: Post-doc Positions at IIASS Message-ID: <9203041459.AA27538@udsab.dia.unisa.it> I.I.A.S.S. International Institute for Advanced Scientific Studies Vietri sul mare (Salerno) Italy The IIASS's main research interests lie in the areas of Neural Networks, Machine Learning, Computer Science and Theoretical Physics. The Institute works in close cooperation with the Departments of Computer Science and Theoretical Physics of the nearby University of Salerno. It calls for applicants for the following POSTDOCTORAL RESEARCH POSITIONS 1. Theory and Applications of Hybrid Systems. 2. Analysis of Monodimensional Signals, mainly ECG and Continuous Speech with Neural Nets. 3. Pattern Recognition and Machine Vision. 4. Machine Learning and Robotics. Salary will be according to Italian standards. The candidates should possess a Ph.D. or, at least, four years of documented scientific research activity in the indicated areas. Age limit: 35 years. Applications should include the following elements: - curriculum vitae: academic career, last position held, detailed documentation of scientific activity. - 2 letters of presentation. - Specification of present and proposed research activity of the applicant. Applications must be addressed to the President of IIASS Professor E.R. Caianiello IIASS via G. Pellegrino, 19 I-84019 Vietri sul mare (Salerno) Italy Deadline: May 31, 1992. For informations contact: Dr. Roberto Tagliaferri Dept. Informatica ed Applicazioni Universita' di Salerno I-84081 Baronissi (Salerno) Italy tel. +39 89 822263 fax. 
+39 89 822275\2
e-mail address: robtag at udsab.dia.unisa.it

From tenorio at ecn.purdue.edu Thu Mar 5 17:00:40 1992
From: tenorio at ecn.purdue.edu (tenorio@ecn.purdue.edu)
Date: Thu, 5 Mar 1992 16:00:40 -0600
Subject: Ability to generalise over multiple training runs
Message-ID: <9203052051.AA01286@dynamo.ecn.purdue.edu>

>In their recent technical report, Bates & Elman note that "it is possible
>for several different networks to reach the same solution to a problem, each
>with a totally different set of weights." (p. 13b) I am interested in the
>relationship between this phenomenon and the measurement of a network's
>ability to generalise.

I have not seen the report yet, and I don't understand the assumptions made to get at this conclusion, but if the solution is defined as some quality of approximation of a finite number of points in the training or testing set, then I contend that there is a very large, possibly infinite, number of networks that would yield the same "solution." This argument is true whether the networks are made with different architectures or a single architecture is chosen; whether it is a classification or an interpolation problem; and whether or not the weights are allowed to be real valued. A simple modification of the input variable order, or the presentation order, or the functions of the nodes, or the initial points, or the number of hidden nodes would lead to different nets. The only way to talk about the "optimum weights" (for a fixed architecture in all respects) is if the function is defined at EVERY possible point. For classification tasks, for example, how many ways can a closed contour be defined with hyperplanes? Or in interpolation, how many functions perfectly visit the data points, yet can do wildly different things at the "don't care" points? Therefore, a function defined by a finite number of points can be represented by an equivalent family of functions within an epsilon of error, regardless of how big the finite set is.

>From the above I presume (possibly incorrectly) that, if there are many
>possible solutions, then some of them will work well for new inputs and
>others will not work well. So on one training run a network may appear to
>generalise well to a new input set, while on another it does not. Does this
>mean that, when connectionists refer to the ability of a network to
>generalise, they are referring to an average ability over many trials? Has
>anyone encountered situations in which the same network appeared to
>generalise well on one learning trial and poorly on another?

This issue came up on the network a couple of weeks ago in a discussion about regularization and network training. It has to do with the network's power to express the function (whether the network has more or fewer degrees of freedom than needed), the limited number of points, and the fact that these points can be noisy and a poor representation of the function itself. Things can get even more hectic if the training set is not a faithful representation of the distribution of the function because of the way it was designed. I'll let the people who published reports on these issues and contributed to the discussion contact you directly with their views.

>Reference:
>Bates, E.A. & Elman, J.L. (1992). Connectionism and the study
> of change. CRL Technical Report 9202, (February).
>
>--
>Paul Atkins                       email: patkins at laurel.mqcc.mq.oz.au
>School of Behavioural Sciences    phone: (02) 805-8606
>Macquarie University              fax  : (02) 805-8062
>North Ryde, NSW, 2113
>Australia.
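A quick way to see the run-to-run spread in practice is simply to train the same architecture from several random starting points and look at the distribution of test-set errors. A rough toy sketch (my illustration only; the arrays Xtr, ytr, Xte, yte are assumed to hold some train/test split):

import numpy as np

def train_mlp(Xtr, ytr, seed, hidden=5, epochs=500, eta=0.5):
    """Batch backprop on a one-hidden-layer sigmoid net from a random start.
    Different seeds typically end in quite different weight settings."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(Xtr.shape[1], hidden))
    W2 = rng.normal(scale=0.5, size=hidden)
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    for _ in range(epochs):
        H = sig(Xtr @ W1)                          # hidden activations
        out = sig(H @ W2)                          # network outputs
        d_out = (out - ytr) * out * (1.0 - out)    # output deltas
        d_hid = np.outer(d_out, W2) * H * (1.0 - H)
        W2 -= eta * (H.T @ d_out) / len(Xtr)
        W1 -= eta * (Xtr.T @ d_hid) / len(Xtr)
    return W1, W2

def test_error(W1, W2, Xte, yte):
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    return np.mean((sig(sig(Xte @ W1) @ W2) > 0.5) != yte)

# errors = [test_error(*train_mlp(Xtr, ytr, seed=s), Xte, yte) for s in range(20)]
# np.mean(errors) is the "average ability"; np.std(errors) is the run-to-run spread.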
< Manoel Fernando Tenorio > < (tenorio at ecn.purdue.edu) or (..!pur-ee!tenorio) > < MSEE233D > < Parallel Distributed Structures Laboratory > < School of Electrical Engineering > < Purdue University > < W. Lafayette, IN, 47907 > < Phone: 317-494-3482 Fax: 317-494-6440 >  From cherwig at eng.clemson.edu Sun Mar 8 03:19:24 1992 From: cherwig at eng.clemson.edu (christoph bruno herwig) Date: Sun, 8 Mar 92 03:19:24 EST Subject: linear separability Message-ID: <9203080819.AA27836@eng.clemson.edu> I was wondering if someone could point me in the right direction concerning the following fundamental separability problem: Given a binary (-1/1) valued training set consisting of n-dimensional input vectors (homogeneous coordinates: n-1 inputs and a 1 for the bias term as the n-th dimension) and 1-dimensional target vectors. For this 2-class classification problem I wish to prove (non-) linear separability solely on the basis of the given training set (hence determine if the underlying problem may be solved with a 2-layer feedforward network). My approach so far: Denote input vectors of the first (second) class with H1 (H2). We need to find a hyperplane in n-dimesional space separating vectors of H1 and H2. The hyperplane goes through the origin of the n-dimensional hypercube. Negate the vectors of H2 and the problem now states: Find the hyperplane with all vectors (from H1 and the negated of H2) on one side of the plane and none on the other. Pick one of the vectors (= a vertex of the hypercube) and compute the Hamming Distance to all other vectors. Clearly, there must be a constraint on how many of the vectors are allowed to be how far "away" in order for the hyperplane to be able to separate. E.g.: If any of the Hamming Distances would be n, then the training set would not be linearly separable. My problem is concerning the constraints for Distances of n-1, n-2, etc... Has anyone taken a similar approach? Are there alternate solutions to linear separability? I tried linear programming, but the simplex algorithm uses a prohibitive number of variables for a mid-size problem. Your help is greatly appreciated, Christoph.  From alexis at CS.UCLA.EDU Sun Mar 8 23:45:38 1992 From: alexis at CS.UCLA.EDU (Alexis Wieland) Date: Sun, 8 Mar 92 20:45:38 -0800 Subject: linear separability In-Reply-To: christoph bruno herwig's message of Sun, 8 Mar 92 03:19:24 EST <9203080819.AA27836@eng.clemson.edu> Message-ID: <9203090445.AA06949@oahu.cs.ucla.edu> > From: christoph bruno herwig > Given a binary (-1/1) valued training set consisting of n-dimensional > input vectors (homogeneous coordinates: n-1 inputs and a 1 for the bias > term as the n-th dimension) and 1-dimensional target vectors. For this 2-class > classification problem I wish to prove (non-) linear separability > solely on the basis of the given training set (hence determine if > the underlying problem may be solved with a 2-layer feedforward network). > My approach so far: > Denote input vectors of the first (second) class with H1 (H2). We need > to find a hyperplane in n-dimesional space separating vectors of H1 > and H2. The hyperplane goes through the origin of the n-dimensional > hypercube. ... Your analysis relies on the dividing hyperplane passing through the origin, a condition that you dutifully state. But this need not be the case for linearly separable problems. Consider the simple 2D case with the four points (1,1), (1,-1), (-1,1), (-1,-1). Place one point in H1 and the rest in H2. 
The problem is clearly linearly separable, but there is no line that passes through the origin that will serve. The min distance from the origin to the dividing hyperplane for a class with one element increases with dimension of the input.

- alexis. (Mr. Spiral :-)

From dlovell at s1.elec.uq.oz.au Mon Mar 9 18:52:48 1992
From: dlovell at s1.elec.uq.oz.au (David Lovell)
Date: Mon, 9 Mar 92 18:52:48 EST
Subject: linear separability
Message-ID: <9203090853.AA20175@c10.elec.uq.oz.au>

Sorry this is going out to everyone, I couldn't find a usable email address for Christoph.

#
#Given a binary (-1/1) valued training set consisting of n-dimensional
#input vectors (homogeneous coordinates: n-1 inputs and a 1 for the bias
#term as the n-th dimension) and 1-dimensional target vectors. For this 2-class
#classification problem I wish to prove (non-) linear separability
#solely on the basis of the given training set (hence determine if
#the underlying problem may be solved with a 2-layer feedforward network).

Actually, you only need one neuron to separate that set of vectors (if, of course, they are linearly separable).

#
#My approach so far:
#Denote input vectors of the first (second) class with H1 (H2). We need
#to find a hyperplane in n-dimesional space separating vectors of H1
#and H2. The hyperplane goes through the origin of the n-dimensional
#hypercube. Negate the vectors of H2 and the problem now states: Find the
#hyperplane with all vectors (from H1 and the negated of H2) on one side
#of the plane and none on the other. Pick one of the vectors (= a vertex of
#the hypercube) and compute the Hamming Distance to all other vectors.
#Clearly, there must be a constraint on how many of the vectors are
#allowed to be how far "away" in order for the hyperplane to be able to
#separate. E.g.: If any of the Hamming Distances would be n, then the
#training set would not be linearly separable.
#My problem is concerning the constraints for Distances of n-1, n-2,
#etc...
#
#Has anyone taken a similar approach?

A distressingly large number of people in the 60's actually. (I've been beating my head against that problem for the past 6 months.)

#Are there alternate solutions to
#linear separability? I tried linear programming, but the simplex
#algorithm uses a prohibitive number of variables for a mid-size
#problem.
#

There is a fundamental result concerning the sums of all TRUE vectors and all FALSE vectors, but it doesn't work with only a sampling of those vectors. Here is a list of papers that deal with the problem as best as they can:

W.H. Highleyman, "A note on linear separation", IRE Trans. Electronic Computers, EC-10, 777-778, 1961.

R.C. Singleton, "A test for linear separability applied to self-organizing machines," in Self-Organizing Systems, M.C. Yovitts, Ed., 503-524, 1961.

The Perceptron Convergence Procedure, the Ho-Kashyap procedure and the TLU synthesis techniques by Dertouzos or Kaszerman will not converge in the non-LS case, see:

Perceptron Convergence Procedure: R.O. Duda & P.E. Hart, "Pattern Classification and Scene Analysis," New York, John Wiley and Sons, 1973.

Ho-Kashyap: ibid.

TLU, Dertouzos: M.L. Dertouzos, "Threshold Logic: A Synthesis Approach" (MIT Research Monograph, vol. 32), Cambridge, MA, MIT Press, 1965.

TLU, Kaszerman: P. Kaszerman, "A Geometric Test-Synthesis Procedure for a Threshold Device", Information and Control, 6, 381-393, 1963.

I'm not sure, but I think those last two require the functions to be completely specified. The approach that I was looking at is as follows.
What we are really interested in is the set of between n and n Choose Floor(n/2) vectors that closest to the separating hyperplane (if there is one). This is the union of the minimally TRUE and maximally FALSE vectors. If we can show that this set of vectors lies on one side of some hyperplane then the problem is solved. The reason that the problem is not solved is that I haven't yet been able to find a method for determining these vectors without having the problem completely specified! #Your help is greatly appreciated, # I could use a bit of help too. Anyone interested? Please mail me if you want or can supply more info. Don't forget to send your email address so that we can continue our discussion in private if it does not prove to be of general interest to the readership of connectionists. Happy connectionisming. -- David Lovell - dlovell at s1.elec.uq.oz.au | | Dept. Electrical Engineering | "Oh bother! The pudding is ruined University of Queensland | completely now!" said Marjory, as BRISBANE 4072 | Henry the daschund leapt up and Australia | into the lemon surprise. | tel: (07) 365 3564 |  From sankar at mbeya.research.att.com Mon Mar 9 09:34:29 1992 From: sankar at mbeya.research.att.com (ananth sankar) Date: Mon, 9 Mar 92 09:34:29 EST Subject: linear separability Message-ID: <9203091434.AA28244@klee.research.att.com> Alexis says with regard to Christoph's message... >Your analysis relies on the dividing hyperplane passing through the >origin, a condition that you dutifully state. But this need not be >the case for linearly separable problems. Consider the simple 2D case >with the four points (1,1), (1,-1), (-1,1), (-1,-1). Place one point >in H1 and the rest in H2. The problem is clearly linearly separable, >but there is no line that passes through the origin that will serve. Christoph also had stated that his n-dimensional input vector consisted of n-1 inputs and a constant input of 1. Thus the search for a solution for "a" and "b" so that a.x > b for all x is now transformed to solving a.x - b > 0 for all x, where (a,b) is a hyperplane passing thru the origin. Your example above has a hyperplane thru origin in 3-space solution if you add an additional input of 1 to each vector. --Ananth  From UBTY003 at cu.bbk.ac.uk Mon Mar 9 07:12:00 1992 From: UBTY003 at cu.bbk.ac.uk (Martin Davies) Date: Mon, 9 Mar 92 12:12 GMT Subject: European Society for Philosophy and Psychology Message-ID: ****** EUROPEAN SOCIETY FOR PHILOSOPHY AND PSYCHOLOGY ****** *********** INAUGURAL CONFERENCE *********** **** 17 - 19 JULY, 1992 **** The Inaugural Conference of the European Society for Philosophy and Psychology will take place in Louvain (Leuven) Belgium, from Friday 17 to Sunday 19 July, 1992. The goal of the Society is 'to promote interaction between philosophers and psychologists on issues of common concern'. The programme for this inaugural meeting will comprise invited lectures - by Dan Sperber and Larry Weiskrantz - and invited symposia. Topics for symposia include: Intentionality, Reasoning, Connectionist Models, Consciousness, Theory of Mind, and Philosophical Issues from Linguistics. There will also be a business meeting to inaugurate the Society formally. The conference will be held in the Institute of Philosophy, University of Louvain. The first session will commence at 3.00 pm on Friday 17 July, and the conference will end at lunchtime on Sunday 19 July. Accommodation at various prices in hotels and student residences will be available. 
To receive further information about registration and accommodation, along with programme details, please contact one of the following:

Daniel Andler
CREA, 1 rue Descartes, 75005 Paris, France
Email: azra at poly.polytechnique.fr

Martin Davies
Philosophy Department, Birkbeck College, Malet Street, London WC1E 7HX, England
Email: ubty003 at cu.bbk.ac.uk

Beatrice de Gelder
Psychology Department, Tilburg University, P.O. Box 90153, 5000 LE Tilburg, Netherlands
Email: beadegelder at kub.nl

Tony Marcel
MRC Applied Psychology Unit, 15 Chaucer Road, Cambridge CB2 2EF, England
Email: tonym at mrc-apu.cam.ac.uk

****************************************************************

From geiger at medusa.siemens.com Mon Mar 9 12:26:32 1992
From: geiger at medusa.siemens.com (Davi Geiger)
Date: Mon, 9 Mar 92 12:26:32 EST
Subject: Call for Workshops
Message-ID: <9203091726.AA06684@medusa.siemens.com>

CALL FOR WORKSHOPS
NIPS*92 Post-Conference Workshops
December 4 and 5, 1992
Vail, Colorado

Request for Proposals

Following the regular NIPS program, workshops on current topics in Neural Information Processing will be held on December 4 and 5, 1992, in Vail, Colorado. Proposals by qualified individuals interested in chairing one of these workshops are solicited.

Past topics have included: Computational Neuroscience; Sensory Biophysics; Recurrent Nets; Self-Organization; Speech; Vision; Rules and Connectionist Models; Neural Network Dynamics; Computational Complexity Issues; Benchmarking Neural Network Applications; Architectural Issues; Fast Training Techniques; Active Learning and Control; Optimization; Bayesian Analysis; Genetic Algorithms; VLSI and Optical Implementations; Integration of Neural Networks with Conventional Software.

The goal of the workshops is to provide an informal forum for researchers to freely discuss important issues of current interest. Sessions will meet in the morning and in the afternoon of both days, with free time in between for ongoing individual exchange or outdoor activities. Specific open and/or controversial issues are encouraged and preferred as workshop topics. Individuals proposing to chair a workshop will have responsibilities including: arrange brief informal presentations by experts working on the topic, moderate or lead the discussion, and report its high points, findings and conclusions to the group during evening plenary sessions, and in a short (2 page) written summary.

Submission Procedure: Interested parties should submit a short proposal for a workshop of interest postmarked by May 22, 1992. (Express mail is *not* necessary. Submissions by electronic mail will also be acceptable.) Proposals should include a title, a short description of what the workshop is to address and accomplish, and the proposed length of the workshop (one day or two days). It should state why the topic is of interest or controversial, why it should be discussed and what the targeted group of participants is. In addition, please send a brief resume of the prospective workshop chair, a list of publications and evidence of scholarship in the field of interest.

Mail submissions to:
Dr. Gerald Tesauro
NIPS*92 Workshops Chair
IBM T. J. Watson Research Center
P.O. Box 704
Yorktown Heights, NY 10598 USA
(e-mail: tesauro at watson.ibm.com)

Name, mailing address, phone number, and e-mail net address (if applicable) must be on all submissions.
PROPOSALS MUST BE POSTMARKED BY MAY 22, 1992 Please Post  From giles at research.nj.nec.com Tue Mar 10 13:21:49 1992 From: giles at research.nj.nec.com (Lee Giles) Date: Tue, 10 Mar 92 13:21:49 EST Subject: Ability to generalise over multiple training runs Message-ID: <9203101821.AA08904@fuzzy.nj.nec.com> Regarding recent discussions on training different nets and their ability to get the same solution: We observed (Giles, et.al., in IJCNN91, NIPS4, & Neural Computation 92) similar results for recurrent nets learning small regular grammars (finite state automata) from positive and negative sample strings. Briefly, the characters of each string are presented at each time step and supervised training occurs at the end of string presentation (RTRL). [See the above papers for more information] Using random initial weight conditions and different numbers of neurons, most trained neural networks perfectly classified the training sets. Using a heuristic extraction method (there are many similar methods), a grammar could be extracted from the trained neural network. These extracted grammars were all different, but could be reduced to a unique "minimal number of states" grammar (or minimal finite state automaton). Though these experiments were for 2nd order fully recurrent nets, we've extracted the same grammars from 1st order recurrent nets using the same training data. Not all machines performed as well on unseen strings. Some were perfect on all strings tested; others weren't. For small grammars, nearly all of the trained neural networks produced perfect extracted grammars. In most cases the nets were trained on 10**3 strings and tested on randomly chosen 10**6 strings whose string length is < 99. (Since an arbitrary number of strings can be generated by these grammars, perfect generalization is not possible to test in practice.) In fact it was possible to extract ideal grammars from the trained nets that classified fairly well, but not perfectly, on the test set. [In other words, you could throw away the net and use just the grammar.} This agrees with Paul Atkins' comment: >From the above I presume (possibly incorrectly) that, if there are many >possible solutions, then some of them will work well for new inputs and >others will not work well. and with Manoel Fernando Tenorio's observation: >...then I contend that there are a very large, possibly infinite networks >architectures, or if a single architecture is chosen; if it is a >classification or interpolation; and if the weights are allowed to be real >valued or not. A simple modification on the input variable order, or the >presentation order, or the functions of the nodes, or the initial points, >or the number of hidden nodes would lead to different nets... C. Lee Giles NEC Research Institute 4 Independence Way Princeton, NJ 08540 USA Internet: giles at research.nj.nec.com UUCP: princeton!nec!giles PHONE: (609) 951-2642 FAX: (609) 951-2482  From rsun at orion.ssdc.honeywell.com Tue Mar 10 16:03:05 1992 From: rsun at orion.ssdc.honeywell.com (Ron Sun) Date: Tue, 10 Mar 92 15:03:05 CST Subject: No subject Message-ID: <9203102103.AA17546@orion.ssdc.honeywell.com> Paper announcement: ------------------------------------------------------------------ Beyond Associative Memories: Logics and Variables in Connectionist Models Ron Sun Honeywell SSDC 3660 Technology Drive Minneapolis, MN 55418 abstract This paper demonstrates the role of connectionist (neural network) models in reasoning beyond that of an associative memory. 
First we show that there is a connection between propositional logics and the weighted-sum computation customarily used in connectionist models. Specifically, the weighted-sum computation can handle Horn clause logic and Shoham's logic as special cases. Secondly, we show how variables can be incorporated into connectionist models to enhance their representational power. We devise solutions to the connectionist variable binding problem to enable connectionist networks to handle variables and dynamic bindings in reasoning. A new model, the Discrete Neuron formalism, which is an extension of the weighted-sum models, is employed for dealing with the variable binding problem. Formal definitions are presented, and examples are analyzed in detail.

To appear in: Information Sciences, special issues on neural nets and AI

It is FTPable from archive.cis.ohio-state.edu in: pub/neuroprose
No hardcopy available.

FTP procedure:
unix> ftp archive.cis.ohio-state.edu (or 128.146.8.52)
Name: anonymous
Password: neuron
ftp> cd pub/neuroprose
ftp> binary
ftp> get sun.beyond.ps.Z
ftp> quit
unix> uncompress sun.beyond.ps.Z
unix> lpr sun.beyond.ps (or however you print postscript)

From dlovell at s1.elec.uq.oz.au Thu Mar 12 10:48:29 1992
From: dlovell at s1.elec.uq.oz.au (David Lovell)
Date: Thu, 12 Mar 92 10:48:29 EST
Subject: Paper on Neocognitron Training avail on neuroprose
Message-ID: <9203120048.AA02305@c10.elec.uq.oz.au>

The following comment (3 pages in length) has been placed in the Neuroprose archive and submitted to IEEE Transactions on Neural Networks. Any comments or questions (both of which are invited) should be addressed to the first author: dlovell at s1.elec.uq.oz.au

Thanks must go to Jordan Pollack for maintaining this excellent service. Apologies if this is the second copy of this notice to reach your site, but there were problems in mailing the original (i.e., I don't think it got through).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

A NOTE ON A CLOSED-FORM TRAINING ALGORITHM FOR THE NEOCOGNITRON

David Lovell, Ah Chung Tsoi & Tom Downs
Intelligent Machines Laboratory, Department of Electrical Engineering
University of Queensland, Queensland 4072, Australia

In this note, a difficulty with the application of Hildebrandt's closed-form training algorithm for the neocognitron is reported. In applying this algorithm we have observed that S-cells frequently fail to respond to features that they have been trained to extract. We present results which indicate that this training vector rejection is an important factor in the overall classification performance of the neocognitron trained using Hildebrandt's procedure.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

filename: lovell.closed-form.ps.Z

FTP INSTRUCTIONS
unix% ftp archive.cis.ohio-state.edu (or 128.146.8.52)
Name: anonymous
Password: anything
ftp> cd pub/neuroprose
ftp> binary
ftp> get lovell.closed-form.ps.Z
ftp> bye
unix% zcat lovell.closed-form.ps.Z | lpr
(or whatever *you* do to print a compressed PostScript file)
----------------------------------------------------------------
David Lovell - dlovell at s1.elec.uq.oz.au | | Dept.
Electrical Engineering | University of Queensland | BRISBANE 4072 | Australia | | tel: (07) 365 3564 |  From smieja at jargon.gmd.de Wed Mar 11 12:13:57 1992 From: smieja at jargon.gmd.de (Frank Smieja) Date: Wed, 11 Mar 92 18:13:57 +0100 Subject: TR--Reflective Neural Network Architecture Message-ID: <9203111713.AA18184@jargon.gmd.de> The following paper has been placed in the Neuroprose archive. ******************************************************************* REFLECTIVE MODULAR NEURAL NETWORK SYSTEMS F. J. Smieja and H. Muehlenbein German National Research Centre for Computer Science (GMD) Schlo{\ss} Birlinghoven, 5205 St. Augustin 1, Germany. ABSTRACT Many of the current artificial neural network systems have serious limitations, concerning accessibility, flexibility, scaling and reliability. In order to go some way to removing these we suggest a {\it reflective neural network architecture}. In such an architecture, the modular structure is the most important element. The building-block elements are called ``\MINOS'' modules. They perform {\it self-observation\/} and inform on the current level of development, or scope of expertise, within the module. A {\it Pandemonium\/} system integrates such submodules so that they work together to handle mapping tasks. Network complexity limitations are attacked in this way with the Pandemonium problem decomposition paradigm, and both static and dynamic unreliability of the whole Pandemonium system is effectively eliminated through the generation and interpretation of {\it confidence\/} and {\it ambiguity\/} measures at every moment during the development of the system. Two problem domains are used to test and demonstrate various aspects of our architecture. {\it Reliability\/} and {\it quality\/} measures are defined for systems that only answer part of the time. Our system achieves better quality values than single networks of larger size for a handwritten digit problem. When both second and third best answers are accepted, our system is left with only 5\% error on the test set, 2.1\% better than the best single net. It is also shown how the system can elegantly learn to handle garbage patterns. With the parity problem it is demonstrated how complexity of problems may be decomposed automatically by the system, through solving it with networks of size smaller than a single net is required to be. Even when the system does not find a solution to the parity problem, because networks of too small a size are used, the reliability remains around 99--100\%. Our Pandemonium architecture gives more power and flexibility to the higher levels of a large hybrid system than a single net system can, offering useful information for higher-level feedback loops, through which reliability of answers may be intelligently traded for less reliable but important ``intuitional'' answers. In providing weighted alternatives and possible generalizations, this architecture gives the best possible service to the larger system of which it will form part. Keywords: Reflective architecture, Pandemonium, task decomposition, confidence, reliability. 
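The flavour of the confidence-based arbitration can be shown with a toy sketch (mine, not the authors' code; the module interface and thresholds are invented for the example): each module returns an answer together with a confidence, and the system answers only when the best module is confident enough and clearly ahead of the runner-up; otherwise it declines.

def pandemonium_answer(modules, x, conf_min=0.7, margin_min=0.2):
    """Toy arbitration over Minos-like modules.  Each module maps
    x -> (answer, confidence in [0, 1]).  Returns (answer, confidence),
    with answer None when the system prefers to reject the pattern."""
    scored = sorted((m(x) for m in modules), key=lambda pair: pair[1], reverse=True)
    best_answer, best_conf = scored[0]
    runner_up_conf = scored[1][1] if len(scored) > 1 else 0.0
    if best_conf < conf_min or (best_conf - runner_up_conf) < margin_min:
        return None, best_conf      # reject: trade coverage for reliability
    return best_answer, best_conf

# answer, confidence = pandemonium_answer([module_0, module_1, module_2], pattern)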
******************************************************************** ---------------------------------------------------------------- FTP INSTRUCTIONS unix% ftp archive.cis.ohio-state.edu (or 128.146.8.52) Name: anonymous Password: neuron ftp> cd pub/neuroprose ftp> binary ftp> get smieja.reflect.ps.Z ftp> bye unix% zcat smieja.reflect.ps.Z | lpr (or whatever *you* do to print a compressed PostScript file) ---------------------------------------------------------------- -Frank Smieja  From holm at nordita.dk Thu Mar 12 05:47:29 1992 From: holm at nordita.dk (Holm Schwarze) Date: Thu, 12 Mar 92 11:47:29 +0100 Subject: No subject Message-ID: <9203121047.AA07386@norsci0.nordita.dk> ** DO NOT FORWARD TO OTHER GROUPS ** The following paper has been placed in the Neuroprose archive in file schwarze.gentree.ps.Z . Retrieval instructions follow the abstract. Hardcopies are not available. -- Holm Schwarze (holm at nordita.dk) ------------------------------------------------------------------------- GENERALIZATION IN A LARGE COMMITTEE MACHINE H. Schwarze and J. Hertz CONNECT, The Niels Bohr Institute and Nordita Blegdamsvej 17, DK-2100 Copenhagen, Denmark ABSTRACT We study generalization in a committee machine with non--overlapping receptive fields trained to implement a function of the same structure. Using the replica method, we calculate the generalization error in the limit of a large number of hidden units. For continuous weights the generalization error falls off asymptotically inversely proportional to alpha, the number of training examples per weight. For binary weights we find a discontinuous transition from poor to perfect generalization followed by a wide region of metastability. Broken replica symmetry is found within this region at low temperatures. The first--order transition occurs at a lower and the metastability limit at a higher value of alpha than in the simple perceptron. ------------------------------------------------------------------------- To retrieve the paper by anonymous ftp: unix> ftp archive.cis.ohio-state.edu # (128.146.8.52) Name: anonymous Password: neuron ftp> cd pub/neuroprose ftp> binary ftp> get schwarze.gentree.ps.Z ftp> quit unix> uncompress schwarze.gentree.ps.Z unix> lpr -P schwarze.gentree.ps ------------------------------------------------------------------------- -------  From MORCIH92%IRLEARN.UCD.IE at BITNET.CC.CMU.EDU Thu Mar 12 09:11:29 1992 From: MORCIH92%IRLEARN.UCD.IE at BITNET.CC.CMU.EDU (Michal Morciniec) Date: Thu, 12 Mar 92 14:11:29 GMT Subject: Internal representations Message-ID: <01GHJVGN5RV4D3YTH2@BITNET.CC.CMU.EDU> I am interested in the internal representations developing in hidden units as a result of training with BP algorithm. I have done some experiments with simple ( one hidden layer with 4 nodes ) network trained to recognise hand-written digits. I found that feature detectors developed during training ( with 500 patterns ) are quite 'similar' to those reported in [1]. Interesting question arises: IF training with different databases of patterns of certain class (in this case hand-printed digits) creates similar internal representations in hidden nodes of networks with similar architecture THEN is it possible to predict what features are likely to be developed as result of training with particular patterns ? Does anybody have a knowledge of research in this area ( i.e WHY such and not other features are created ) ? 1. 
Martin G.L., Pittman J.A, "Recognizing Hand-Printed Letters and Digits" Thanks for comments, ===================================================================== !! Michal Morciniec !! !! !! 3 Sweetmount Pk., !! !! !! Dundrum, Dublin 14, !! !! !! !! MORCIH92 at IRLEARN.UCD.IE !! !! Eire !! !! =====================================================================  From MURRE at rulfsw.LeidenUniv.nl Thu Mar 12 16:09:00 1992 From: MURRE at rulfsw.LeidenUniv.nl (MURRE@rulfsw.LeidenUniv.nl) Date: Thu, 12 Mar 1992 16:09 MET Subject: 68 neurosimulators Message-ID: <01GHK9WIA4408WW4MH@rulfsw.LeidenUniv.nl> We have now updated and extended our table with neurosimulators to include 68 neurosimulators. We present the table below. (Sorry, for the many bytes taken by this format. We expect that this format is easier to handle by everyone.) Work on the review paper, unfortunately, has been interrupted by several events. We plan to have something available within the next few months. In this paper we will ponder on the possibility of deriving some standards for a number of the 'most popular' neural networks. If we could agree on such a set, it would be much easier to directly exchange models and simulation scripts (at least, for this limited set of neural network paradigms). Has anyone ever worked on this? If anyone wants to point out errors, fill in some blanks, or prosose to add (or remove) a system from the list, please, follow the format of the table. Additional comments (i.e., extra references to be included in the general review paper, background information, or reasons why a certain entry is wrong) may then follow the changed lines. Example: The following line in the table ought to be changed to: Name Manufacturer Hardware METANET Leiden University IBM, MAC Within 6 months from now, a MAC version will be available for this system. Adherence to this format will make it much easier for us to deal with the comments. Jacob M.J. Murre Steven E. Kleyenmberg Jacob M.J. Murre Unit of Experimental and Theoretical Psychology Leiden University P.O. Box 9555 2300 RB Leiden The Netherlands E-mail: Murre at HLERUL55.Bitnet tel.: 31-71-273631 fax.: 31-71-273619 N.B. At April 1 1992, I will start working at the following address: Jacob M.J. Murre Medical Research Council: Applied Psychology Unit 15 Chaucer Road Cambridge CB2 2EF England E-mail: jaap.murre at mrc-apu.cam.ac.uk tel.: 44-223-355294 (ext.139) fax.: 44-223-359062 Table 1.a. Neurosimulators. Name Manufacturer Hardware -------------------------------------------------------------------------------- ADAPTICS Adaptic ANNE Oregon Graduate Center Intel iPSC hypercube ANSE TRW TWR neurocom. mark 3,4,5 ANSIM SAIC IBM ANSKIT SAIC ANSPEC SIAC IBM,MAC,SUN,VAX,SIGMA/DELTA AWARENESS Neural Systems IBM AXON HNC Inc. HNC Neurocom. ANZA,ANZA+ BOSS BPS George Mason Univ., Fairfax IBM,VAX,SUN BRAIN SIMULATOR Abbot,Foster & Hauserman IBM BRAINMAKER California Scientific Software IBM CABLE Duke University VAX CASCOR CASENET COGNITRON Cognitive Software MAC,IBM CONE IBM Palo Alto IBM CONNECTIONS IBM COPS Case Western Reserve Univ. CORTEX DESIRE/NEUNET IBM EXPLORENET 3000 HNC Inc. IBM,VAX GENESIS Neural Systems IBM GENESIS/XODUS VAX,SUN GRADSIM VAX GRIFFIN Texas Instruments/Cambridge TI NETSIM neurocomputer HYPERBRAIN Neurix Inc. MAC MACBRAIN Neurix Inc. MAC MACTIVATION University of Colorado MAC METANET Leiden University IBM,(VAX) MIRRORS/II University of Maryland VAX,SUN N-NET AIWare Inc. IBM,VAX N1000 Nestor Inc IBM,SUN N500 Nestor Inc. 
IBM NCS North Carolina State Univ. (portable) NEMOSYS IBM RS/6000 NESTOR Nestor Inc. IBM,MAC NET NETSET 2 HNC Inc. IBM,SUN,VAX NETWURKZ Dair Computer Systems IBM NEURALSHELL Ohio State University SUN NEURALWORKS NeuralWare Inc. IBM,MAC,SUN,NEXT,INMOS NEURDS Digtal Equipment Corporation VAX NEUROCLUSTERS VAX NEURON Duke University NEUROSHELL Ward Systems Group IBM NEUROSOFT HNC Inc. NEUROSYM NeuroSym Corp. IBM NEURUN Dare research IBM NN3/SESAME GMD, Sankt Augustin, BDR SUN NNSIM OPT OWL Olmsted & Watkins IBM,MAC,SUN,VAX P3 U.C.S.D. Symbolics PABLO PDP McClelland & Rumelhart IBM,MAC PLANET University of Colorado SUN,APOLLO,ALLIANT PLATO/ARISTOTLE NeuralTech PLEXI Symbolics Inc/Lucid Inc Symbolics,SUN POPLOG-NEURAL University of Sussex SUN,VAX PREENS Nijmegen University SUN PYGMALION Esprit SUN,VAX RCS Rochester University SUN,MAC SAVY TEXT RETR. SYS. Excalibur Technologies IBM,VAX SFINX U.C.L.A. SLONN Univ. of Southern California SNNS Stuttgart University SUN,DEC,HP,IBM SUNNET SUN Table 1.b. Neurosimulators. Name Language Models Price $ -------------------------------------------------------------------------------- ADAPTICS ANNE HLL/ILL/NDL ANSE ANSIM many 495 ANSKIT ANSPEC HLL many 995 AWARENESS 275 AXON HLL 1950 BOSS BPS C bp 100 BRAIN SIMULATOR 99 BRAINMAKER Macro bp 195 CABLE HLL CASCOR CASENET Prolog COGNITRON HLL (Lisp) many 600 CONE HLL CONNECTIONS hopf 87 COPS CORTEX DESIRE/NEUNET matrix EXPLORENET 3000 GENESIS 1095 GENESIS/XODUS C GRADSIM C GRIFFIN HYPERBRAIN 995 MACBRAIN many 995 MACTIVATION METANET HLL (C) many 1000 MIRRORS/II HLL (Lisp) several N-NET C bp 695 N1000 19000 N500 NCS HLL (C++) many NEMOSYS NESTOR 9950 NET NETSET 2 many 19500 NETWURKZ 80 NEURALSHELL C many NEURALWORKS C 1495 NEURDS C NEUROCLUSTERS NEURON HLL NEUROSHELL bp 195 NEUROSOFT NEUROSYM many 179 NEURUN bp NN3/SESAME many NNSIM OPT C OWL many 1495 P3 HLL many PABLO PDP several 44 PLANET HLL many PLATO/ARISTOTLE PLEXI Lisp,C,Pascal many POPLOG-NEURAL HLL,POP-11 bp,cl PREENS HLL many PYGMALION HLL (parallel C) many RCS C SAVY TEXT RETR. SYS. C SFINX HLL SLONN SNNS HLL many SUNNET Table 1.c. Neurosimulators. Name Comments -------------------------------------------------------------------------------- ADAPTICS training software for neural-networks ANNE neural-network development environment ANSE ANSIM ANSKIT development tool for large artificial neural-networks ANSPEC AWARENESS introductory NN program AXON neural-network description language BOSS BPS BRAIN SIMULATOR BRAINMAKER neural-networks simulation software CABLE CASCOR cascade-correlation simulator CASENET graphical case-tool for generating executable code COGNITRON neural-network,prototyping,delivery system CONE research environment CONNECTIONS COPS combinatorial optimization problems CORTEX neural-network graphics tool DESIRE/NEUNET interactive neural-networks experiment environment EXPLORENET 3000 stand-alone neural-network software GENESIS neural-network development system GENESIS/XODUS general neural simulator, X-wnd. 
output, simulation utilities GRADSIM GRIFFIN research environment for TI NETSIM neurocomputer HYPERBRAIN MACBRAIN MACTIVATION introductory neural-network simulator METANET general neurosimulator, CAD for NN architectures MIRRORS/II neurosimulator for parallel environments N-NET integrated neural-network development system N1000 N500 NCS NEMOSYS simulation software NESTOR NET NETSET 2 NETWURKZ training tool for IBM pc NEURALSHELL NEURALWORKS neural-networks development system NEURDS NEUROCLUSTERS simulation tool for biological neural networks NEURON NEUROSHELL NEUROSOFT NEUROSYM NEURUN interactive neural-network environment NN3/SESAME neurosimulator for modular neural networks NNSIM mixed neural/digital image processing system OPT all-purpose simulator OWL P3 early PDP development system PABLO PDP introductory simulator, complements 'the PDP volumes' PLANET PLATO/ARISTOTLE knowledge processor for expert systems PLEXI flexible neurosimulator with graphical interaction POPLOG-NEURAL PREENS workbench for NN constr., visualisation,man., and simul. PYGMALION general, parallel neurosimulator under X-Windows RCS research environment, graphical neurosimulator SAVY TEXT RETRIEVAL SYSTEM SFINX research environment SLONN SNNS SUNNET Table 1.d. Neurosimulators. Name Abbreviated reference -------------------------------------------------------------------------------- ADAPTICS ANNE ANSE ANSIM [Cohen, H., Neural Network Review, 3, 102-133, 1989] ANSKIT [Barga R.S, Proc. IJCNN-90-Washington DC, 2, 94-97, 1990] ANSPEC AWARENESS [BYTE, 14(8), 244-245, 1989] AXON [BYTE, 14(8), 244-245, 1989] BOSS [Reggia J.A., Simulation, 51, 5-19, 1988] BPS BRAIN SIMULATOR BRAINMAKER [BYTE, 14(8), 244-245, 1989] CABLE [Miller J.P., Nature, 347, 783-784, 1990] CASCOR CASENET [Dobbins R.W, Proc. IJCNN-90-Wash. DC, 2, 122-125, 1990] COGNITRON [BYTE, 14(8), 244-245, 1989] CONE CONNECTIONS [BYTE, 14(8), 244-245, 1989] COPS [Takefuji Y., Science, 245, 1221-1223, 1990] CORTEX [Reggia J.A., Simulation, 51, 5-19, 1988] DESIRE/NEUNET [Korn G.A, Neural Networks, 2, 229-237, 1989] EXPLORENET 3000 [BYTE, 14(8), 244-245, 1989] GENESIS [Miller J.P., Nature, 347, 783-784, 1990] GENESIS/XODUS GRADSIM GRIFFIN HYPERBRAIN [BYTE, 14(8), 244-245, 1989] MACBRAIN [BYTE, 14(8), 244-245, 1989] MACTIVATION METANET [Murre J.M.J., Proc. ICANN-91-FIN, 1, 545-550, 1991] MIRRORS/II [Reggia, J.A., Simulation, 51, 5-19, 1988] N-NET [BYTE, 14(8), 244-245, 1989] N1000 [BYTE, 14(8), 244-245, 1989] N500 [BYTE, 14(8), 244-245, 1989] NCS NEMOSYS [Miller J.P., Nature, 347, 783-784, 1990] NESTOR NET [Reggia J.A., Simulation, 51, 5-19, 1988] NETSET 2 NETWURKZ [BYTE, 14(8), 244-245, 1989] NEURALSHELL NEURALWORKS [BYTE, 14(8), 244-245, 1989] NEURDS NEUROCLUSTERS NEURON [Miller J.P., Nature, 347, 783-784, 1990] NEUROSHELL [BYTE, 14(8), 244-245, 1989] NEUROSOFT NEUROSYM NEURUN NN3/SESAME NNSIM [Nijhuis J.L., Microproc. & Microprogr., 27,189-94, 1989] OPT OWL [BYTE, 14(8), 244-245, 1989] P3 [In: 'PDP Volume 1', MIT Press, 488-501, 1986] PABLO PDP [Rumelhart et al. 'Explorations in PDP', MIT Press, 1988] PLANET PLATO/ARISTOTLE PLEXI POPLOG-NEURAL PREENS PYGMALION RCS SAVY TEXT RETR. SYS. [BYTE, 14(8), 244-245, 1989] SFINX [Mesrobian E., IEEE Int. Conf. on Man, Sys. 
& Cyb., 1990] SLONN [Simulation, 55, 69-93, 1990] SNNS SUNNET Explanation of abbreviations and terms: Manufacturer: company, institute, or researchers associated with the system Languages: HLL = High Level Language (i.e., network definition language; if specific programming languages are mentioned, networks can be defined using high-level functions in these languages) Models: several = a fixed number of models is (and will be) supported many = the systems can be (or will be) extended with new models bp = backpropagation (if specific models are mentioned, these are the only ones supported by the system) hopf = hopfield cl = competitive learning Price: indication of price range in US dollars (if no price is this can either mean that the price is unknown to us, that the system is not available (yet) for general distribution, or that the system is available at a nominal charge) Comment: attempt to indicate the primary function of the system Reference: a single reference that contains pointers to the manufacturers, who may be contacted for further information (a more complete list of references, also containing review articles, etc., will appear in a general review paper by us - this paper is still in preparation and not yet available for prelimary distribution [sorry])   From rsun at orion.ssdc.honeywell.com Thu Mar 12 15:58:57 1992 From: rsun at orion.ssdc.honeywell.com (Ron Sun) Date: Thu, 12 Mar 92 14:58:57 CST Subject: No subject Message-ID: <9203122058.AA01720@orion.ssdc.honeywell.com> TR availble: (comments and suggestions are welcome) ------------------------------------------------------------------ An Efficient Feature-based Connectionist Inheritance Scheme Ron Sun Honeywell SSDC 3660 Technology Drive Minneapolis, MN 55418 The paper describes how a connectionist architecture deals with the inheritance problem in an efficient and natural way. Based on the connectionist architecture CONSYDERR, we analyze the problem of property inheritance and formulate it in ways facilitating conceptual clarity and implementation. A set of ``benchmarks" is specified for ensuring the correctness of inheritance mechanisms. Parameters of CONSYDERR are formally derived to satisfy these benchmark requirements. We discuss how chaining of is-a links and multiple inheritance can be handled in this architecture. This paper shows that CONSYDERR with a two-level dual (localist and distributed) representation can handle inheritance and cancellation of inheritance correctly and extremely efficiently, in constant time instead of proportional to the length of a chain in an inheritance hierarchy. It also demonstrates the utility of a meaning-oriented, intensional approach (with features) for supplementing and enhancing extensional approaches. ---------------------------------------------------------------- It is FTPable from archive.cis.ohio-state.edu in: pub/neuroprose (Courtesy of Jordan Pollack) No hardcopy available. 
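As a toy illustration of the general idea (not the CONSYDERR architecture itself; the features and weights below are invented), inheritance through shared feature vectors rather than chained is-a links can be pictured like this:

# Concepts are feature vectors; a property is computed directly from the
# features by a weighted sum, so a more specific concept "inherits" in one
# step and can cancel a default simply by carrying a different feature value.
BIRD    = {"wings": 1.0, "feathers": 1.0, "flight_muscles": 1.0}
PENGUIN = {"wings": 1.0, "feathers": 1.0, "flight_muscles": 0.0, "swims": 1.0}

CAN_FLY_WEIGHTS = {"wings": 0.4, "feathers": 0.2, "flight_muscles": 0.6}

def has_property(concept, weights, threshold=0.9):
    """Weighted-sum evaluation of a property over a concept's features."""
    return sum(w * concept.get(f, 0.0) for f, w in weights.items()) > threshold

print(has_property(BIRD, CAN_FLY_WEIGHTS))     # True:  default obtained directly
print(has_property(PENGUIN, CAN_FLY_WEIGHTS))  # False: cancelled without chaining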
FTP procedure: unix> ftp archive.cis.ohio-state.edu (or 128.146.8.52) Name: anonymous Password: neuron ftp> cd pub/neuroprose ftp> binary ftp> get sun.inh.ps.Z ftp> quit unix> uncompress sun.inh.ps.Z unix> lpr sun.inh.ps (or however you print postscript)  From FEGROSS at weizmann.weizmann.ac.il Fri Mar 13 02:58:21 1992 From: FEGROSS at weizmann.weizmann.ac.il (Tal Grossman) Date: Fri, 13 Mar 92 09:58:21 +0200 Subject: linear separability Message-ID: The issue of verifying whether a given set of vectors is linearly separable or not was discussed time and again in this forum, and it is certainly very interesting for many of us. I will therefore remind here a few old refs. and add one (hopefully relevant) insight and one new (surely relevant) reference. The basic geometrical approach is that of the convex hulls of the two classes: if their intersection is non empty, then the two sets are not linearly separable (and vice versa). This method is really old (the Highleyman paper). Related methods can be found in Lewis and Coates, "Threshold Logic" (Wiley 1967). In terms of computational complexity these methods are not any better than linear programming. About the idea of using the distances among the vectors: I suggest the following simple experiment - For large N (say 100), create P random, N dimensional binary vectors, and then make a histogram of the hamming distance between all pairs. Compare such histograms for P<2N (in this case the set is almost always linearly separable) and for P>2N (and then it is almost always non l.s.). You will find no difference. Which shows that as it is presented, your approach can not help in verifying linear separability. The last item is a new perceptron type learning algorithm by Nabutovsky and Domany (Neural Computation 3 (1991) 604). It either finds a solution (namely, a separating vector) to a given set, or stops with a definite conclusion that the problem is non separable. and of course I would be glad to hear about any new idea - Tal Grossman (fegross at weizmann) Electronics Dept. Weizmann Inst. of Science Rehovot 76100 ISRAEL  From smieja at jargon.gmd.de Fri Mar 13 10:27:11 1992 From: smieja at jargon.gmd.de (Frank Smieja) Date: Fri, 13 Mar 92 16:27:11 +0100 Subject: TR (reflective) Message-ID: <9203131527.AA27751@jargon.gmd.de> -) ******************************************************************* -) REFLECTIVE MODULAR NEURAL NETWORK SYSTEMS -) -) F. J. Smieja and H. Muehlenbein -) -) German National Research Centre for Computer Science (GMD) -) Schlo{\ss} Birlinghoven, -) 5205 St. Augustin 1, -) Germany. -) -) ABSTRACT -) -) Many of the current artificial neural network systems have serious -) limitations, concerning accessibility, flexibility, scaling and -) reliability. In order to go some way to removing these we suggest a -) {\it reflective neural network architecture}. In such an architecture, -) the modular structure is the most important element. The -) building-block elements are called ``\MINOS'' modules. They perform -) {\it self-observation\/} and inform on the current level of -) development, or scope of expertise, within the module. A {\it -) Pandemonium\/} system integrates such submodules so that they work -) together to handle mapping tasks. 
Network complexity limitations are -) attacked in this way with the Pandemonium problem decomposition -) paradigm, and both static and dynamic unreliability of the whole -) Pandemonium system is effectively eliminated through the generation -) and interpretation of {\it confidence\/} and {\it ambiguity\/} -) measures at every moment during the development of the system. -) -) Two problem domains are used to test and demonstrate various aspects -) of our architecture. {\it Reliability\/} and {\it quality\/} measures -) are defined for systems that only answer part of the time. Our system -) achieves better quality values than single networks of larger size for -) a handwritten digit problem. When both second and third best answers -) are accepted, our system is left with only 5\% error on the test set, -) 2.1\% better than the best single net. It is also shown how the -) system can elegantly learn to handle garbage patterns. With the -) parity problem it is demonstrated how complexity of problems may be -) decomposed automatically by the system, through solving it with -) networks of size smaller than a single net is required to be. Even -) when the system does not find a solution to the parity problem, -) because networks of too small a size are used, the reliability remains -) around 99--100\%. -) -) Our Pandemonium architecture gives more power and flexibility to the -) higher levels of a large hybrid system than a single net system can, -) offering useful information for higher-level feedback loops, through -) which reliability of answers may be intelligently traded for less -) reliable but important ``intuitional'' answers. In providing weighted -) alternatives and possible generalizations, this architecture gives the -) best possible service to the larger system of which it will form part. -) -) Keywords: Reflective architecture, Pandemonium, task decomposition, -) confidence, reliability. -) ******************************************************************** -) -) -) ---------------------------------------------------------------- -) FTP INSTRUCTIONS -) -) unix% ftp archive.cis.ohio-state.edu (or 128.146.8.52) -) Name: anonymous -) Password: neuron -) ftp> cd pub/neuroprose -) ftp> binary -) ftp> get smieja.reflect.ps.Z -) ftp> bye -) unix% zcat smieja.reflect.ps.Z | lpr -) (or whatever *you* do to print a compressed PostScript file) -) ---------------------------------------------------------------- -) Apparently the original format was such that it was not possible to print out on American-sized paper. Therefore I have changed the format and re-inserted the file smieja.reflect.ps.Z into the neuroprose archive. It should be all on the sheet now. Instructions as before. -Frank Smieja  From LWCHAN at CUCSD.CUHK.HK Fri Mar 13 04:31:00 1992 From: LWCHAN at CUCSD.CUHK.HK (LAI-WAN CHAN) Date: Fri, 13 Mar 1992 17:31 +0800 Subject: Internal representations Message-ID: <7C58C87DE020033B@CUCSD.CUHK.HK> > I am interested in the internal representations developing in hidden > units as a result of training with BP algorithm. I have done some I have done experiments to find out the internal representations of the BP net [1]. I used some training sets and looked at their hidden nodes. The hidden nodes showed particular arrangement (e.g. residing on a circle) for some training patterns. I did not include any results on the digit recognition but I found some hidden nodes have been trained to be responsible for some feature detection. 
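As a concrete illustration of this kind of inspection (a minimal sketch of
my own, not code from the work cited below), one can train a tiny network
with plain backpropagation and then simply print the hidden-unit activation
vector for each training pattern. The 4-2-4 encoder task, the learning
rate, the momentum term and the random seed are all illustrative
assumptions; only numpy is assumed:

----------------------------------------------------------------
import numpy as np

rng = np.random.default_rng(1)
X = np.eye(4)                      # four one-hot input patterns
T = X.copy()                       # auto-association: reproduce the input

W1 = rng.uniform(-0.5, 0.5, (4, 2)); b1 = np.zeros(2)   # 4-2-4 encoder
W2 = rng.uniform(-0.5, 0.5, (2, 4)); b2 = np.zeros(4)
sig = lambda z: 1.0 / (1.0 + np.exp(-z))

lr, mom = 0.5, 0.9                 # illustrative settings
vW1 = np.zeros_like(W1); vb1 = np.zeros_like(b1)
vW2 = np.zeros_like(W2); vb2 = np.zeros_like(b2)
for epoch in range(20000):
    H = sig(X @ W1 + b1)           # hidden activations, one row per pattern
    Y = sig(H @ W2 + b2)
    dY = (Y - T) * Y * (1 - Y)     # output deltas (squared error, sigmoids)
    dH = (dY @ W2.T) * H * (1 - H) # hidden deltas
    vW2 = mom * vW2 - lr * H.T @ dY;  W2 += vW2
    vb2 = mom * vb2 - lr * dY.sum(0); b2 += vb2
    vW1 = mom * vW1 - lr * X.T @ dH;  W1 += vW1
    vb1 = mom * vb1 - lr * dH.sum(0); b1 += vb1

H = sig(X @ W1 + b1)
print("final squared error:", float(np.sum((sig(H @ W2 + b2) - T) ** 2)))
for x, h in zip(X, H):
    print(x, "-> hidden", np.round(h, 2))
----------------------------------------------------------------

With most seeds the two hidden units settle on a distinct code for each of
the four patterns, which is the sort of arrangement one looks for when
examining internal representations.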
[1] Analysis of the Internal Representations in Neural Networks for Machine
Intelligence, Lai-Wan CHAN, AAAI-91, Vol.2, p578-583, 1991.

Lai-Wan Chan,
Computer Science Dept, The Chinese University of Hong Kong,
Shatin, N.T., Hong Kong.
email : lwchan at cucsd.cuhk.hk
tel : (+852) 609 8865
FAX : (+852) 603 5024

From kamil at apple.com Fri Mar 13 16:44:36 1992
From: kamil at apple.com (Kamil A. Grajski)
Date: Fri, 13 Mar 92 13:44:36 -0800
Subject: Summer Internships at Apple Computer
Message-ID: <9203132144.AA28397@apple.com>

Unofficial announcement

Summer Internships at Apple Computer in Cupertino, CA

The Speech & Language Technologies Department in Apple's Advanced
Technology Group has summer internship positions available. The typical
intern experience is to focus on one project under the close supervision of
one or more senior researchers/engineers. In the past, intern projects in
this group have resulted in lasting contributions - not just busy work!
There is a formal review process, end-of-summer presentation, etc. There
are corporate-wide summer intern social and professional events, too.

Qualifications can span several areas. Upper-division undergraduate or
early graduate students are preferred in the following areas: a.) general
speech processing; b.) front-end signal processing; c.) statistical pattern
recognition, e.g., HMMs, general methods; d.) speech synthesis; e.) natural
language; and f.) Macintosh (MPW) programming. This is not an exhaustive
list, but the main point is that the candidate should have a strong
commitment to doing really great work in speech technology.

I apologize in advance that I will NOT be able to acknowledge each and
every inquiry individually.

Kamil A. Grajski
kamil at apple.com

From thildebr at aragorn.csee.lehigh.edu Sat Mar 14 13:39:33 1992
From: thildebr at aragorn.csee.lehigh.edu (Thomas H. Hildebrandt )
Date: Sat, 14 Mar 92 13:39:33 -0500
Subject: Internal representations
In-Reply-To: LAI-WAN CHAN's message of Fri, 13 Mar 1992 17:31 +0800 <7C58C87DE020033B@CUCSD.CUHK.HK>
Message-ID: <9203141839.AA13493@aragorn.csee.lehigh.edu>

See also:

Bunpei Irie and Mitsuo Kawato, "Acquisition of internal representation by
multi-layered perceptrons", Denshi Joohoo Tsuushin Gakkai Ronbunshi (Tr. of
the Institute of Electronic Communication Engineers (of Japan)),
V.J73-D-II, N.8, pp.1173-1178 (Aug 1990), in Japanese.

My translation of this article will appear in Systems and Computers in
Japan (Scripta Technica, Silver Spring, MD), but may be obtained directly
(sans figures) by sending me an e-mail request.

Thomas H. Hildebrandt
Visiting Research Scientist
CSEE Department
Lehigh University

From FEDIMIT at weizmann.weizmann.ac.il Sun Mar 15 14:11:02 1992
From: FEDIMIT at weizmann.weizmann.ac.il (Dan Nabutovsky)
Date: Sun, 15 Mar 92 21:11:02 +0200
Subject: Linear nonseparability
Message-ID:

> From: christoph bruno herwig
> I was wondering if someone could point me in the right direction
> concerning the following fundamental separability problem:
> Given a binary (-1/1) valued training set consisting of n-dimensional
> input vectors (homogeneous coordinates: n-1 inputs and a 1 for the bias
> term as the n-th dimension) and 1-dimensional target vectors. For this
> 2-class classification problem I wish to prove (non-) linear separability
> solely on the basis of the given training set (hence determine if
> the underlying problem may be solved with a 2-layer feedforward network).

An algorithm that solves this problem is described in our paper: D. Nabutovsky & E.
Domany, "Learning the Unlearnable", Neural Computation 3 (1991), 604.

We present a perceptron learning rule that finds a separating plane when a
set of patterns is linearly separable, and proves linear non-separability
otherwise. Our approach is completely different from those described by
Christoph. Our idea is to do perceptron learning while always keeping in
mind a constraint on the distance between the current weight vector and a
solution. When this constraint becomes impossible to satisfy,
non-separability is proved. Using a sophisticated choice of learning step
size, we ensure that the algorithm always finds a solution or proves its
absence in a finite number of steps.

Dan Nabutovsky
(FEDIMIT at WEIZMANN.WEIZMANN.AC.IL)

From bradley at ivy.Princeton.EDU Sun Mar 15 19:21:16 1992
From: bradley at ivy.Princeton.EDU (Bradley Dickinson)
Date: Sun, 15 Mar 92 19:21:16 EST
Subject: Nominations Sought for IEEE NNC Awards
Message-ID: <9203160021.AA26259@ivy.Princeton.EDU>

Nominations Sought for IEEE Neural Networks Council Awards

The IEEE Neural Networks Council is soliciting nominations for its two
awards. The awards will be presented at the June 1992 International Joint
Conference on Neural Networks. Nominations for these awards should be
submitted in writing according to the instructions given below.

------------------------------------------------------------------
IEEE Transactions on Neural Networks Outstanding Paper Award

This is an award of $500 for the outstanding paper published in the IEEE
Transactions on Neural Networks in the previous two-year period. For 1992,
all papers published in 1990 (Volume 1) and in 1991 (Volume 2) in the IEEE
Transactions on Neural Networks are eligible. For a paper with multiple
authors, the award will be shared by the coauthors. Nominations must
include a written statement describing the outstanding characteristics of
the paper. The deadline for receipt of nominations is April 20, 1992.
Nominations should be sent to Prof. Bradley W. Dickinson, NNC Awards Chair,
Dept. of Electrical Engineering, Princeton University, Princeton, NJ
08544-5263.

------------------------------------------------------------------------
IEEE Neural Networks Council Pioneer Award

This award has been established to recognize and honor the vision of those
people whose efforts resulted in significant contributions to the early
concepts and developments in the neural networks field. Up to three awards
may be presented annually to outstanding individuals whose main
contribution has been made at least fifteen years earlier. The recognition
is engraved on the Neural Networks Pioneer Medal specially struck for the
Council. Selection of Pioneer Medalists will be based on nomination letters
received by the Pioneer Awards Committee. All who meet the contribution
requirements are eligible, and anyone can nominate. The award is not
approved posthumously. Written nomination letters must include a detailed
description of the nominee's contributions and must be accompanied by full
supporting documentation. For the 1992 Pioneer Award, nominations must be
received by April 20, 1992. Nominations should be sent to Prof. Bradley W.
Dickinson, NNC Pioneer Award Chair, Department of Electrical Engineering,
Princeton University, Princeton, NJ 08544-5263.

----------------------------------------------------------------------------
Questions and preliminary inquiries about the above awards should be
directed to Prof. Bradley W.
Dickinson, NNC Awards Chair; telephone: (609)-258-2916, electronic mail:
bradley at ivy.princeton.edu

From dhw at santafe.edu Fri Mar 13 19:18:48 1992
From: dhw at santafe.edu (David Wolpert)
Date: Fri, 13 Mar 92 17:18:48 MST
Subject: New paper
Message-ID: <9203140018.AA25119@sfi.santafe.edu>

******* DO NOT FORWARD TO OTHER LISTS *****************

The following paper has been placed in neuroprose:

A RIGOROUS INVESTIGATION OF "EVIDENCE" AND "OCCAM FACTORS"
IN BAYESIAN REASONING

by David H. Wolpert

Abstract: This paper first reviews the reasoning behind the Bayesian
"evidence" procedure for setting parameters in the probability
distributions involved in inductive inference. This paper then proves that
the evidence procedure is incorrect. More precisely, this paper proves that
the assumptions going into the evidence procedure do not, as claimed, "let
the data determine the distributions". Instead, those assumptions simply
amount to an implicit replacement of the original distributions, containing
free parameters, with new distributions, none of whose parameters are free.
For example, as used by MacKay [1991] in the context of neural nets, the
evidence procedure is a means for using the training set to determine the
free parameter alpha in the prior distribution P({wi}) proportional to
exp(alpha x S), where the N wi are the N weights in the network, and S is
the sum of the squares of those weights. As this paper proves, in actuality
the assumptions going into MacKay's use of the evidence procedure do not
result in a distribution P({wi}) proportional to exp(alpha x S), for some
alpha, but rather result in a parameter-less distribution, P({wi})
proportional to (S) ** [-(N/2 + 1)]. This paper goes on to prove that if
one makes the assumption of an "entropic prior" with unknown parameter
value, in addition to the assumptions used in the evidence procedure, then
the prior is completely fixed, but in a form which cannot be entropic.
(This calls into question the self-consistency of the numerous arguments
purporting to derive an entropic prior "from first principles".) Finally,
this paper goes on to investigate the Bayesian first-principles "proof" of
Occam's razor involving Occam factors. This paper proves that that "proof"
is flawed.

To retrieve this file, do the following:

unix> ftp archive.cis.ohio-state.edu
Name (archive.cis.ohio-state.edu:dhw): anonymous
Password: neuron
ftp> cd pub/neuroprose
ftp> binary
ftp> get wolpert.evidence.ps.Z
ftp> quit
unix> uncompress wolpert.evidence.ps.Z
unix> lpr wolpert.evidence.ps

From marchman at merlin.psych.wisc.edu Mon Mar 16 15:03:10 1992
From: marchman at merlin.psych.wisc.edu (Virginia Marchman)
Date: Mon, 16 Mar 92 14:03:10 -0600
Subject: TR AVAILABLE
Message-ID: <9203162003.AA11810@merlin.psych.wisc.edu>

*********************************************************************
CENTER FOR RESEARCH IN LANGUAGE
UNIVERSITY OF CALIFORNIA, SAN DIEGO
Technical Report #9201
*********************************************************************

LANGUAGE LEARNING IN CHILDREN AND NEURAL NETWORKS:
PLASTICITY, CAPACITY AND THE CRITICAL PERIOD

Virginia A. Marchman
Department of Psychology, University of Wisconsin, Madison

ABSTRACT

This paper investigates constraints on dissociation and plasticity using
connectionist models of the acquisition of an artificial language analogous
to the English past tense. Several networks were "lesioned" in varying
amounts both prior to and after the onset of training.
In Study I, the network was trained on mappings similar to English regular verbs (e.g., walk ==> walked). Long term effects of injury were not observed in this simple homogeneous task, yet trajectories of development were dampened in relation to degree of damage prior to training, and post-natal lesions resulted in substantive short term performance deficits. In Study II, the vocabulary was comprised of regular, as well as irregular verbs (e.g., go ==> went). In intact nets, the acquisition of the regulars was considerably slowed, and performance was increasingly susceptible to injury, both acutely and in terms of eventual recovery, as a function of size and time of lesion. In contrast, irregulars were learned quickly and were relatively impervious to the effects of injury. Generalization to novel forms indicates that these behavioral dissociations result from the competition between the two classes of forms within a single mechanism system, rather than a selective disruption of the mechanism guiding the learning of regular forms. Two general implications for research on language development and breakdown are discussed: (1) critical period effects may derive from prior learning history in interaction with the language to be learned ("entrenchment"), rather than endogenously determined maturational change, and (2) selective dissociations in behavior CAN result from general damage in systems that are *not* modularized in terms of rule based vs. associative mechanisms (cf. Pinker, 1991). ********************************************************************* Hard copies of this report are available upon request from John at staight at crl.ucsd.edu. Please ask for CRL TR #9201, and provide your surface mailing address. In addition, this TR can be retrieved via anonymous ftp from the pub/neuralnets directory at crl.ucsd.edu. The entire report consists of 8 postscript files (1 text file, 7 files of figures). In order to ease retrieval, we have compiled these into a single tar file that must be extracted before printing. (Report is 20 pages total). Instructions: unix> ftp crl.ucsd.edu Connected to crl.ucsd.edu. 220 crl local FTP server (Version 5.85cub) ready. Name: anonymous 331 Guest login ok, send email address as password. Password: my-email-address ftp> cd pub/neuralnets 250 CWD command successful. ftp> binary 200 Type set to I. ftp> get tr9201.tar.Z ftp> bye unix> uncompress tr9201.tar.Z [MUST EXTRACT TAR FILE BEFORE PRINTING] unix> tar -xf tr9201.tar [RESULT IS 8 POSTSCRIPT FILES] unix> lpr tr9201.*.ps [or however you send your files to your postscript printer]  From thildebr at athos.csee.lehigh.edu Mon Mar 16 14:24:09 1992 From: thildebr at athos.csee.lehigh.edu (Thomas H. Hildebrandt ) Date: Mon, 16 Mar 92 14:24:09 -0500 Subject: Paper in neuroprose Message-ID: <9203161924.AA02714@athos.csee.lehigh.edu> The following paper, which has been submitted to IEEE Transactions on Neural Networks, is now available in PostScript format through the neuroprose archive: "Why Batch Learning is Slower Than Per-Sample Learning" Thomas H. Hildebrandt ABSTRACT We compare the convergence properties of the batch and per-sample versions of the standard backpropagation algorithm. The comparison is made on the basis of ideal step sizes computed for the two algorithms with respect to a simplified, linear problem. For either algorithm, convergence is guaranteed as long as no step exceeds the minimum ideal step size by more than a factor of 2. 
By limiting the discussion to a fixed, safe step size, we can compare the maximum step that can be taken by each algorithm in the worst case. It is found that the maximum fixed safe step size is $P$ times smaller for the batch version than for the per-sample version, where $P$ is the number of training examples. This fact is balanced somewhat by the fact that batch algorithm sums $P$ substeps in order to compute its step, meaning that the steps taken by the two algorithms are comparable in size. However, the batch algorithm takes only one step per epoch while the per-sample algorithm takes $P$. Thus, the conclusion is that the batch algorithm is $P$ times slower in a serial implementation. In response to last Fall's discussion involving Yann LeCun, Kamil Grajski and others regarding the unexpectedly poor performance of parallel implementations of the batch backpropagation algorithm, I performed an analysis of the convergence speed of batch and per-sample versions of the backpropagation algorithm based on calculation of the ideal step size. The conclusion is that, even if there are as many processors as training samples, the parallel implementation of a batch algorithm which does not alter its step size adaptively during an epoch can never be faster than the serial implementation of the per-sample algorithm. Due to the manner in which the problem is approached, it does not exactly go beyond the "multiple-copies" argument, as desired by LeCun. However, it does succeed in formalizing that argument. In the process, it also defines a relative measure of "redundancy" in the training set as correlation (the degree of collinearity) between the training vectors in the input space. Such a measure can be computed directly before training is begun. To obtain copies of this article: unix> ftp archive.cis.ohio-state.edu (or 128.146.8.52) Name : anonymous Password: ftp> cd pub/neuroprose ftp> binary ftp> get hildebrandt.batch.ps.Z ftp> quit unix> uncompress hildebrandt.batch.ps.Z unix> lpr -Pps hildebrandt.batch.ps (or however you print PostScript) (Thanks to Jordan Pollack for providing this valuable service to the NN Research community.) Thomas H. Hildebrandt Visiting Research Scientist CSEE Department Lehigh University  From xiru at Think.COM Wed Mar 18 10:33:51 1992 From: xiru at Think.COM (xiru Zhang) Date: Wed, 18 Mar 92 10:33:51 EST Subject: Paper in neuroprose In-Reply-To: "Thomas H. Hildebrandt "'s message of Mon, 16 Mar 92 14:24:09 -0500 <9203161924.AA02714@athos.csee.lehigh.edu> Message-ID: <9203181533.AA01429@yangtze.think.com> Date: Mon, 16 Mar 92 14:24:09 -0500 From: "Thomas H. Hildebrandt " The following paper, which has been submitted to IEEE Transactions on Neural Networks, is now available in PostScript format through the neuroprose archive: "Why Batch Learning is Slower Than Per-Sample Learning" Thomas H. Hildebrandt ABSTRACT We compare the convergence properties of the batch and per-sample versions of the standard backpropagation algorithm. The comparison is made on the basis of ideal step sizes computed for the two algorithms with respect to a simplified, linear problem. For either algorithm, convergence is guaranteed as long as no step exceeds the minimum ideal step size by more than a factor of 2. By limiting the discussion to a fixed, safe step size, we can compare the maximum step that can be taken by each algorithm in the worst case. 
It is found that the maximum fixed safe step size is $P$ times smaller for the batch version than for the per-sample version, where $P$ is the number of training examples. This fact is balanced somewhat by the fact that batch algorithm sums $P$ substeps in order to compute its step, meaning that the steps taken by the two algorithms are comparable in size. However, the batch algorithm takes ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ only one step per epoch while the per-sample algorithm takes $P$. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Thus, the conclusion is that the batch algorithm is $P$ times slower ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ in a serial implementation. ^^^^^^^^^^^^^^^^^^^^^^^^^^^ The last argument is not sound: the directions computed based on one training example and that based on all training examples can be very different, thus even if the step size is the same, the convergence rate can be different. This may not be a serious problem for the example in your paper, where the network has linear (identity) activation function and no hidden units, but in a multiple-layer network with non-linear units, not only the step size is important, the direction of the step is at least equally important. - Xiru Zhang  From thildebr at aragorn.csee.lehigh.edu Wed Mar 18 14:37:17 1992 From: thildebr at aragorn.csee.lehigh.edu (Thomas H. Hildebrandt ) Date: Wed, 18 Mar 92 14:37:17 -0500 Subject: Paper in neuroprose In-Reply-To: xiru Zhang's message of Wed, 18 Mar 92 10:33:51 EST <9203181533.AA01429@yangtze.think.com> Message-ID: <9203181937.AA18295@aragorn.csee.lehigh.edu> From: xiru Zhang Date: Wed, 18 Mar 92 10:33:51 EST Excerpt from my abstract: However, the batch algorithm takes ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ only one step per epoch while the per-sample algorithm takes $P$. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Thus, the conclusion is that the batch algorithm is $P$ times slower ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ in a serial implementation. ^^^^^^^^^^^^^^^^^^^^^^^^^^^ Comment by Zhang: The last argument is not sound: the directions computed based on one training example and that based on all training examples can be very different, thus even if the step size is the same, the convergence rate can be different. This may not be a serious problem for the example in your paper, where the network has linear (identity) activation function and no hidden units, but in a multiple-layer network with non-linear units, not only the step size is important, the direction of the step is at least equally important. - Xiru Zhang My rebuttal: The direction of the step taken by batch BP and the direction taken by per-sample BP are BOTH BAD, in the sense that they do not point directly at the minimum, but toward the bottom of the valley in the direction of steepest descent -- a different thing entirely. In the simplified case I examine, the batch algorithm steps daintily to the bottom of the valley after one epoch. In the mean time, the P-S algorithm is responding in turn to the individual components of the error function. In doing so, it often crosses the center of the valley in the total (batch) error surface. As a result, P-S takes larger steps on the average, and ends up closer to the minimum than does batch. In addition, this overstepping of the minimum makes the P-S version of the algorithm more robust toward local minima. This subject is dealt with more fully in the text of the paper. 
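To make the step-counting part of this argument concrete, here is a minimal
numerical sketch (mine, not code from the paper) on the single-layer linear
case the paper analyzes. Dividing the summed gradient by P in the batch
update plays the role of the P-times-smaller safe step size discussed
above, so the individual steps of the two versions are comparable in size;
the batch version then takes one step per epoch while the per-sample
version takes P. The problem size, the step size and the random seed are
arbitrary illustrative choices, and only numpy is assumed:

----------------------------------------------------------------
import numpy as np

rng = np.random.default_rng(0)
P, N = 50, 2
X = rng.normal(size=(P, N))        # P training inputs
w_true = rng.normal(size=N)
y = X @ w_true                     # consistent (noise-free) linear targets

eta = 0.01                         # same fixed, safe step size for both versions
w_batch = np.zeros(N)              # one step per epoch (summed gradient / P)
w_ps = np.zeros(N)                 # P per-sample steps per epoch
for epoch in range(1, 21):
    w_batch -= eta * 2.0 * X.T @ (X @ w_batch - y) / P
    for p in range(P):
        w_ps -= eta * 2.0 * X[p] * (X[p] @ w_ps - y[p])
    if epoch % 5 == 0:
        print("epoch %2d   batch dist %.4f   per-sample dist %.6f"
              % (epoch,
                 np.linalg.norm(w_batch - w_true),
                 np.linalg.norm(w_ps - w_true)))
----------------------------------------------------------------

With these settings the per-sample weights end up far closer to the minimum
after the same number of epochs; the sketch says nothing about the
multilayer nonlinear case.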
One thing I did not note in my paper is that, in the linear case, a batch step will take you to the bottom of the valley by the shortest route. This reduces the dimensionality of the search by 1, so after a number of epochs equal to the number of dimensions in your search space, you know that you are at the minimum. However, pleasant this mathematical fiction is, when you add nonlinearity to the picture, all bets are off. It MAY BE that the next step taken by the batch algorithm will place it squarely at the bottom of the valley, where it will remain. The same can be true for the P-S algorithm. However possible this may be, to me it does not appear probable. The paper I have posted begs to be followed by one which analyzes the likelihood of such a lucky step occurring, or at least gives a stochastic view of the paths which the two algorithms follow in approaching the minimum. I will look into it, time permitting. Viewed intuitively: In any guessing game, you are likely to do better if the number of queries you are allowed to make increases. The batch algorithm allows itself one query (as to the local slope of the error surface) per epoch, while the per-sample algorithm gets $P$ times as many. Thomas H. Hildebrandt CSEE Department Lehigh University  From TEPPER at CVAX.IPFW.INDIANA.EDU Wed Mar 18 15:42:20 1992 From: TEPPER at CVAX.IPFW.INDIANA.EDU (TEPPER@CVAX.IPFW.INDIANA.EDU) Date: Wed, 18 Mar 1992 15:42:20 -0500 (EST) Subject: PDP & NN5 Message-ID: <920318154220.20c023e3@CVAX.IPFW.INDIANA.EDU> Fifth NN & PDP CONFERENCE PROGRAM - April 9, 10 and 11,1992 ----------------------------------------------------------- The Fifth Conference on Neural Networks and Parallel Distributed Processing at Indiana University-Purdue University at Fort Wayne will be held April 9, 10, and 11, 1992. Conference registration is $20 (on site). Students and members or employees of supporting organizations attend free. Some limited financial support might also be available to allow students to attend. Inquiries should be addressed to: US mail: ------- Pr. Samir Sayegh Physics Department Indiana University-Purdue University Fort Wayne, IN 46805-1499 email: sayegh at ipfwcvax.bitnet ----- FAX: (219)481-6880 --- Voice: (219) 481-6306 OR 481-6157 ----- All talks will be held in Kettler Hall, Room G46: Thursday, April 9, 6pm-9pm; Friday Morning & Afternoon (Tutorial Sessions), 8:30am-12pm & 1pm-4:30pm and Friday Evening 6pm-9pm; Saturday, 9am-12noon. Parking will be available near the Athletic Building or at any Blue A-B parking lots. Do not park in an Orange A lot or you may get a parking violation ticket. Special hotel rates (IPFW corporate rates) are available at Canterbury Green, which is a 5 minute drive from the campus. The number is (219) 485-9619. The Marriott Hotel also has corporate rates for IPFW and is about a 10 minute drive. Their number is (219) 484-0411. Another hotel with corporate rates for IPFW is Don Hall's Guesthouse (about 10 minutes away). Their number is (219) 489-2524. The following talks will be presented: Applications I - Thursday 6pm-7:30pm -------------------------------------- Nasser Ansari & Janusz A. Starzyk, Ohio University. DISTANCE FIELD APPROACH TO HANDWRITTEN CHARACTER RECOGNITION Thomas L. Hemminger & Yoh-Han Pao, Case Western Reserve University. A REAL- TIME NEURAL-NET COMPUTING APPROACH TO THE DETECTION AND CLASSIFICATION OF UNDERWATER ACOUSTIC TRANSIENTS Seibert L. Murphy & Samir I. Sayegh, Indiana-Purdue University. 
ANALYSIS OF THE CLASSIFICATION PERFORMANCE OF A BACK PROPAGATION NEURAL
NETWORK DESIGNED FOR ACOUSTIC SCREENING

S. Keyvan, L. C. Rabelo, & A. Malkani, Ohio University. NUCLEAR DIAGNOSTIC
MONITORING SYSTEM USING ADAPTIVE RESONANCE THEORY

J.L. Fleming & D.G. Hill, Armstrong Lab, Brooks AFB. STUDENT MODELING USING
ARTIFICIAL NEURAL NETWORKS

Biological and Cooperative Phenomena Optimization I - Thursday 7:50pm-9pm
---------------------------------------------------------------------------
Ljubomir T. Citkusev & Ljubomir J. Buturovic, Boston University.
NON-DERIVATIVE NETWORK FOR EARLY VISION

Yalin Hu & Robert J. Jannarone, University of South Carolina. A
NEUROCOMPUTING KERNEL ALGORITHM FOR REAL-TIME, CONTINUOUS COGNITIVE
PROCESSING

M.B. Khatri & P.G. Madhavan, Indiana-Purdue University, Indianapolis. ANN
SIMULATION OF THE PLACE CELL PHENOMENON USING CUE SIZE RATIO

Mark M. Millonas, University of Texas at Austin. CONNECTIONISM AND SWARM
INTELLIGENCE

---------------------------------------------------------------------------
---------------------------------------------------------------------------

Tutorials I - Friday 8:30am-11:45am
-------------------------------------
Bill Frederick, Indiana-Purdue University. INTRODUCTION TO FUZZY LOGIC

Helmut Heller, University of Illinois. INTRODUCTION TO TRANSPUTER SYSTEMS

Arun Jagota, SUNY-Buffalo. THE HOPFIELD NETWORK, ASSOCIATIVE MEMORIES, AND
OPTIMIZATION

Tutorials II - Friday 1:15pm-4:30pm
-------------------------------------
Krzysztof J. Cios, University of Toledo. SELF-GENERATING NEURAL NETWORK
ALGORITHM: CID3 APPLICATION TO CARDIOLOGY

Robert J. Jannarone, University of South Carolina. REAL-TIME
NEUROCOMPUTING, AN INTRODUCTION

Network Analysis I - Friday 6pm-7:30pm
----------------------------------------
M.R. Banan & K.D. Hjelmstad, University of Illinois at Urbana-Champaign. A
SUPERVISED TRAINING ENVIRONMENT BASED ON LOCAL ADAPTATION, FUZZINESS, AND
SIMULATION

Pranab K. Das II, University of Texas at Austin. CHAOS IN A SYSTEM OF FEW
NEURONS

Arun Maskara & Andrew Noetzel, University Heights. FORCED LEARNING IN
SIMPLE RECURRENT NEURAL NETWORKS

Samir I. Sayegh, Indiana-Purdue University. SEQUENTIAL VS CUMULATIVE
UPDATE: AN EXPANSION

D.A. Brown, P.L.N. Murthy, & L. Berke, The College of Wooster.
SELF-ADAPTATION IN BACKPROPAGATION NETWORKS THROUGH VARIABLE DECOMPOSITION
AND OUTPUT SET DECOMPOSITION

Applications II - Friday 7:50pm-9pm
-------------------------------------
Susith Fernando & Karan Watson, Texas A & M University. ANNs TO INCORPORATE
ENVIRONMENTAL FACTORS IN HI FAULTS DETECTION

D.K. Singh, G.V. Kudav, & T.T. Maxwell, Youngstown State University.
FUNCTIONAL MAPPING OF SURFACE PRESSURES ON 2-D AUTOMOTIVE SHAPES BY NEURAL
NETWORKS

K. Hooks, A. Malkani, & L. C. Rabelo, Ohio University. APPLICATION OF
ARTIFICIAL NEURAL NETWORKS IN QUALITY CONTROL CHARTS

B.E. Stephens & P.G. Madhavan, Purdue University at Indianapolis. SIMPLE
NONLINEAR CURVE FITTING USING THE ARTIFICIAL NEURAL NETWORK

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

Network Analysis II - Saturday 9am-10:30am
-------------------------------------------
Sandip Sen, University of Michigan. NOISE SENSITIVITY IN A SIMPLE
CLASSIFIER SYSTEM

Xin Wang, University of Southern California. DYNAMICS OF DISCRETE-TIME
RECURRENT NEURAL NETWORKS: PATTERN FORMATION AND EVOLUTION

Zhenni Wang and Christine Di Massimo, University of Newcastle.
A PROCEDURE FOR DETERMINING THE CANONICAL STRUCTURE OF MULTILAYER NEURAL NETWORKS Srikanth Radhakrishnan, Tulane University. PATTERN CLASSIFICATION USING THE HYBRID COULOMB ENERGY NETWORK Biological and Cooperative Phenomena Optimization II - Saturday 10:50am-12noon ------------------------------------------------------------------------------- J. Wu, M. Penna, P.G. Madhavan, & L. Zheng, Purdue University at Indianapolis. COGNITIVE MAP BUILDING AND NAVIGATION C. Zhu, J. Wu, & Michael A. Penna, Purdue University at Indianapolis. USING THE NADEL TO SOLVE THE CORRESPONDENCE PROBLEM Arun Jagota, SUNY-Buffalo. COMPUTATIONAL COMPLEXITY OF ANALYZING A HOPFIELD-CLIQUE NETWORK Assaad Makki, & Pepe Siy, Wayne State University. OPTIMAL SOLUTIONS BY MODIFIED HOPFIELD NEURAL NETWORKS  From thildebr at aragorn.csee.lehigh.edu Wed Mar 18 15:53:30 1992 From: thildebr at aragorn.csee.lehigh.edu (Thomas H. Hildebrandt ) Date: Wed, 18 Mar 92 15:53:30 -0500 Subject: Paper in neuroprose In-Reply-To: garyc@cs.uoregon.edu's message of Wed, 18 Mar 92 11:35:13 -0800 <9203181935.AA29887@sisters.cs.uoregon.edu> Message-ID: <9203182053.AA18310@aragorn.csee.lehigh.edu> Date: Wed, 18 Mar 92 11:35:13 -0800 From: garyc at cs.uoregon.edu But your measure of redundancy - collinearity - seems appropriate for your linear domain; what about redundancy for a nonlinear map? gary cottrell I think that the appropriate measure is the degree of collinearity of the training vectors in class space, i.e. after the nonlinear mapping has been performed. Obviously, this requires you to know the answer (i.e. have in hand the completely trained network) before you can measure redundancy, so the measure is not very useful. However, if you accept it as the correct definition of redundancy, then you can apply certain assumptions (e.g. local linearity of the input space, linearity in certain subspaces, etc.) which will allow you to estimate the measure a priori with varying degrees of accuracy. Thomas H. Hildebrandt CSEE Department Lehigh University  From garyc at cs.uoregon.edu Wed Mar 18 14:35:13 1992 From: garyc at cs.uoregon.edu (garyc@cs.uoregon.edu) Date: Wed, 18 Mar 92 11:35:13 -0800 Subject: Paper in neuroprose Message-ID: <9203181935.AA29887@sisters.cs.uoregon.edu> But your measure of redundancy - collinearity - seems appropriate for your linear domain; what about redundancy for a nonlinear map? gary cottrell  From berenji at ptolemy.arc.nasa.gov Thu Mar 19 20:06:11 1992 From: berenji at ptolemy.arc.nasa.gov (Hamid Berenji) Date: Thu, 19 Mar 92 17:06:11 PST Subject: FUZZ-IEEE'93 call for papers (revised) Message-ID: Please place the following call for papers on your mailing list. Thank you, Hamid R. Berenji Senior Research Scientist NASA Ames Research Center *************************** CALL FOR PAPERS SECOND IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS FUZZ-IEEE'93 San Francisco, California March 28 - April 1, 1993 In recent years, increasing attention has been devoted to fuzzy-logic approaches and to their application to the solution of real-world problems. 
The Second IEEE International Conference on Fuzzy Systems (FUZZ-IEEE '93) will be dedicated to the discussion of advances in: * Basic Principles and Foundations of Fuzzy Logic * Relations between Fuzzy Logic and other Approximate Reasoning Methods * Qualitative and Approximate-Reasoning Modeling * Hardware Implementations of Fuzzy-Logic Algorithms * Learning and Acquisition of Approximate Models * Relations between Fuzzy Logic and Neural Networks * Applications to * System Control * Intelligent Information Systems * Case-Based Reasoning * Decision Analysis * Signal Processing * Image Understanding * Pattern Recognition * Robotics and Automation * Intelligent Vehicle and Highway Systems This conference will be held concurrently with the 1993 IEEE International Conference on Neural Networks. Participants will be able to attend the technical events of both meetings. CONFERENCE ORGANIZATION This conference is sponsored by the IEEE Neural Networks Council, in cooperation with: International Fuzzy Systems Association North American Fuzzy Information Processing Society Japan Society for Fuzzy Theory and Systems. IEEE Systems, Man, and Cybernetics Society ELITE - European Laboratory for Intelligent Techniques Engineering The conference includes tutorials, exhibits, plenary sessions, and social events. ORGANIZING COMMITTEE GENERAL CHAIR: Enrique H.Ruspini Artificial Intelligence Center SRI International CHAIR: Piero P. Bonissone General Electric CR&D PROGRAM ADVISORY BOARD: J. Bezdek E. Sanchez E. Trillas D. Dubois Ph. Smets T. Yamakawa G. Klir M. Sugeno L.A. Zadeh H. Prade T. Terano H.J. Zimmerman FINANCE: R. Tong (Chair) R. Nutter PUBLICITY: H. Berenji (Chair) B. D'Ambrosio R. Lopez de Mantaras T. Takagi LOCAL ARRANGEMENTS: S. Ovchinnikov TUTORIALS: J. Bezdek (Chair) H. Berenji H. Watanabe EXHIBITS: A. Ralescu M. Togai L. Valverde W. Xu T. Yamakawa H.J. Zimmerman TUTORIAL INFORMATION The following tutorials have been scheduled: Introduction to Fuzzy-Set Theory, Uncertainty, and Fuzzy Logic Prof. George J. Klir, SUNY Fuzzy Logic in Databases and Information Retrieval Prof. Maria Zemankova, NSF Fuzzy Logic and Neural Networks for Pattern Recognition Prof. James C. Bezdek, Univ. of West Florida Hardware Approaches to Fuzzy-Logic Applications Prof. Hiroyuki Watanabe, Univ. North Carolina Fuzzy Logic and Neural Networks for Control Systems Dr. Hamid R. Berenji, NASA Ames Research Center Fuzzy Logic and Neural Networks for Computer Vision Prof. James Keller, Univ. of Missouri EXHIBIT INFORMATION Exhibitors are encouraged to present the latest innovations in fuzzy hardware, software, and systems based on applications of fuzzy logic. For additional information, please contact Meeting Management at Tel. (619) 453-6222, FAX (619) 535-3880. CALL FOR PAPERS In addition to the papers related to any of the above areas, the program committee cordially invites interested authors to submit papers dealing with any aspects of research and applications related to the use of fuzzy models. Papers will be carefully reviewed and only accepted papers will appear in the FUZZ-IEEE '93 Proceedings. DEADLINE FOR PAPERS: September 21, 1992 Papers must be received by September 21, 1992. Six copies of the paper must be submitted. The paper must be written in English and its length should not exceed 8 pages including figures, tables, and references. Papers must be submitted on 8-1/2" x 11" white paper with 1" margins on all four sides. 
They should be prepared by typewriter or letter-quality printer in one column format, single-spaced, in Times or similar type style, 10 points or larger, and printed on one side of the paper only. Please include title, author(s) name(s) and affiliation(s) on top of first page followed by an abstract. FAX submissions are not acceptable. Please send submissions prior to the deadline to: Dr. Piero P. Bonissone General Electric Corporate Research and Development Building K-1, Room 5C32A 1 River Road Schenectady, New York 12301 FOR ADDITIONAL INFORMATION REGARDING FUZZ-IEEE'93 PLEASE CONTACT: Meeting Management 5665 Oberlin Drive Suite 110 San Diego CA 92121 Tel. (619) 453-6222 FAX (619) 535-3880 -------  From guy at minster.york.ac.uk Thu Mar 19 08:13:10 1992 From: guy at minster.york.ac.uk (guy@minster.york.ac.uk) Date: 19 Mar 1992 13:13:10 GMT Subject: paper at neuroprose: Neural Networks as Components Message-ID: A paper "Neural Networks as Components", which is to be presented at the SPIE Conference on the Science of Artificial Neural Networks in Orlando, Florida in April, is available in the neuroprose archive as "smith.components.ps.Z". The paper discusses the use of neural networks as system components, and suggests research has concentrated too much on algorithms in isolation. The desirable properties of good components are listed: uniform interface, wide range of functionality and performance, and robustness. The benefits of viewing networks as plug compatible components are shown in a design example. Directions for research are suggested. The neuroprose archive is at internet address "128.146.8.52" in directory "pub/neuroprose". Limited numbers of hard copies may be available. The paper will be in conference proceedings. Happy reading, Guy Smith.  From nin at cns.brown.edu Thu Mar 19 10:59:36 1992 From: nin at cns.brown.edu (Nathan Intrator) Date: Thu, 19 Mar 92 10:59:36 EST Subject: Why batch learning is slower Message-ID: <9203191559.AA09604@cns.brown.edu> "Why Batch Learning is Slower Than Per-Sample Learning" Thomas H. Hildebrandt From the abstract: "...For either algorithm, convergence is guaranteed as long as no step exceeds the minimum ideal step size by more than a factor of 2. By limiting the discussion to a fixed, safe step size, we can compare the maximum step that can be taken by each algorithm in the worst case." ------- There is no "FIXED safe step size" for the stochastic version, namely there is no convergence proof for a fixed learning rate of the stochastic version. The paper cited by Chung-Ming Kuan and Kurt Hornik does not imply that either. It is therefore difficult to draw conclusions from this paper. - Nathan  From tenorio at ecn.purdue.edu Fri Mar 20 11:55:35 1992 From: tenorio at ecn.purdue.edu (tenorio@ecn.purdue.edu) Date: Fri, 20 Mar 1992 10:55:35 -0600 Subject: workshop invitations for PEEII Message-ID: <9203201545.AA18429@dynamo.ecn.purdue.edu> > >"NETS WORK" WORKSHOP > >Neural network research will be the theme of the Spring 1992 >Purdue Electrical Engineering Industrial Institute (PEEII) workshop >on Monday and Tuesday, April 6 and 7. > >The workshop is a regular feature of our industrial affiliates program, >and we would like representatives of your organization to be our guests >at this meeting. 
> >Presentations will include: > > The Parallel Distributed Processing Lab and Self-Organizing Structures > Similarity-Based Algorithms for Prediction and Control > Fusing Algorithm Responses: Multiple Networks Cooperating > on a Single Task > Applications of Feedforward Neural Networks to the Control of Dynamical > Systems > Fuzzy Neural Networks > A Neural-Network-Based Fuzzy Logic Control and Decision System > Novel Neural Network Architectures and Their Applications > Parallel, Self-Organizing, Hierachical Neural Networks with Continuous > Inputs and Outputs > Solving Constrained Optimization Problems with Artificial Neural > Networks > Learning Algorithms in Associative Memories > Neural Computing With Linear Threshold Elements > Implementation of Neural Networks on Highly-Parallel Computers >Keynote Address: >Applying Neural Networks for Process Understanding and Process Control > > >Presentation schedules and workshop registration forms are available from: > >Mary Moyars-Johnson >Manager,Industrial Relations >phone: (317) 494-3441 >e-mail: moyars at ecn.purdue.edu > >Workshop reservations must be made by March 26. > >Room reservations may be made at the Union Club (317) 494-8913 >or at local hotels/motels. > > > < Manoel Fernando Tenorio > < (tenorio at ecn.purdue.edu) or (..!pur-ee!tenorio) > < MSEE233D > < Parallel Distributed Structures Laboratory > < School of Electrical Engineering > < Purdue University > < W. Lafayette, IN, 47907 > < Phone: 317-494-3482 Fax: 317-494-6440 >  From thildebr at athos.csee.lehigh.edu Fri Mar 20 10:37:57 1992 From: thildebr at athos.csee.lehigh.edu (Thomas H. Hildebrandt ) Date: Fri, 20 Mar 92 10:37:57 -0500 Subject: Why batch learning is slower In-Reply-To: Nathan Intrator's message of Thu, 19 Mar 92 10:59:36 EST <9203191559.AA09604@cns.brown.edu> Message-ID: <9203201537.AA04951@athos.csee.lehigh.edu> Date: Thu, 19 Mar 92 10:59:36 EST From: nin at cns.brown.edu (Nathan Intrator) "Why Batch Learning is Slower Than Per-Sample Learning" Thomas H. Hildebrandt From the abstract: "...For either algorithm, convergence is guaranteed as long as no step exceeds the minimum ideal step size by more than a factor of 2. By limiting the discussion to a fixed, safe step size, we can compare the maximum step that can be taken by each algorithm in the worst case." ------- There is no "FIXED safe step size" for the stochastic version, namely there is no convergence proof for a fixed learning rate of the stochastic version. The paper cited by Chung-Ming Kuan and Kurt Hornik does not imply that either. It is therefore difficult to draw conclusions from this paper. - Nathan I have not done it, but it appears straightforward to show convergence for the linear network model with a fixed step size. The actual step taken is the product of the step size with the derivative of the error. If each step taken reduces the error in an unbiased way, then the process will converge. In this, I am not really treating a stochastic version, since in the true sense, this would make the training set an infinite sequence of random vectors. For both algorithms I assumed that there is a finite set of training vectors which can be examined repeatedly. I think this is a fairly standard assumption. It IS difficult to draw firm conclusions from this paper regarding the behavior of the two versions of BP on multilayer nonlinear networks, since the analysis is restricted to a single-layer linear network. 
It was intended to provide some intuition as to the unexpectedly poor performance of parallel implementations of batch BP, and to suggest an approach for the analysis of the multilayer nonlinear case. Thomas H. Hildebrandt Visiting Research Scientist CSEE Department Lehigh University  From UDAH256 at oak.cc.kcl.ac.uk Fri Mar 20 07:46:00 1992 From: UDAH256 at oak.cc.kcl.ac.uk (Mark Plumbley) Date: Fri, 20 Mar 92 12:46 GMT Subject: M.Sc. and Ph.D. Courses in NNs at King's College London Message-ID: Fellow Connectionists, Please post or forward this announcement about our M.Sc. and Ph.D. courses to anyone who might be interested. Please direct any enquiries about the courses to the postgraduate secretary (address at the end of the notice). Thanks, Mark. ------------------------------------------------------------------------- Dr. Mark D. Plumbley M.Plumbley at oak.cc.kcl.ac.uk Tel: +44 71 873 2241 Centre for Neural Networks Fax: +44 71 873 2017 Department of Mathematics/King's College London/Strand/London WC2R 2LS/UK ------------------------------------------------------------------------- CENTRE FOR NEURAL NETWORKS and DEPARTMENT OF MATHEMATICS King's College London Strand London WC2R 2LS, UK M.Sc. AND Ph.D. COURSES IN NEURAL NETWORKS --------------------------------------------------------------------- M.Sc. in INFORMATION PROCESSING and NEURAL NETWORKS --------------------------------------------------- A ONE YEAR COURSE CONTENTS Dynamical Systems Theory Fourier Analysis Biosystems Theory Advanced Neural Networks Control Theory Combinatorial Models of Computing Digital Learning Digital Signal Processing Theory of Information Processing Communications Neurobiology REQUIREMENTS First Degree in Physics, Mathematics, Computing or Engineering NOTE: For 1992/93 we have 3 SERC quota awards for this course, which must be allocated by 30th July 1992. --------------------------------------------------------------------- Ph.D. in NEURAL COMPUTING ------------------------- A 3-year Ph.D. programme in NEURAL COMPUTING is offered to applicants with a First degree in Mathematics, Computing, Physics or Engineering (others will also be considered). The first year consists of courses given under the M.Sc. in Information Processing and Neural Networks (see attached notice). Second and third year research will be supervised in one of the various programmes in the development and application of temporal, non-linear and stochastic features of neurons in visual, auditory and speech processing. There is also work in higher level category and concept formation and episodic memory storage. Analysis and simulation are used, both on PC's SUNs and main frame machines, and there is a programme on the development and use of adaptive hardware chips in VLSI for pattern and speed processing. This work is part of the activities of the Centre for Neural Networks in the School of Physical Sciences and Engineering, which has 47 researchers in Neural Networks. It is one of the main centres of the subject in the U.K. 
--------------------------------------------------------------------- For further information on either of these courses please contact: Postgraduate Secretary Department of Mathematics King's College London Strand London WC2R 2LS, UK  From arun at hertz.njit.edu Mon Mar 16 13:44:56 1992 From: arun at hertz.njit.edu (arun maskara spec lec cis) Date: Mon, 16 Mar 92 13:44:56 -0500 Subject: Paper available in Neuroprose Message-ID: <9203161844.AA23382@hertz.njit.edu> The following paper is now available by ftp from neuroprose archive: Forcing Simple Recurrent Neural Networks to Encode Context Arun Maskara, New Jersey Institute of Technology, Department of Computer and Information Sciences University Heights, Newark, NJ 07102, arun at hertz.njit.edu Andrew Noetzel, The William Paterson College, Department of Computer Science, Wayne, NJ 07470 Abstract The Simple Recurrent Network (SRN) is a neural network model that has been designed for the recognition of symbol sequences. It is a back-propagation network with a single hidden layer of units. The symbols of a sequence are presented one at a time at the input layer. But the activation pattern in the hidden units during the previous input symbol is also presented as an auxiliary input. In previous research, it has been shown that the SRN can be trained to behave as a finite state automaton (FSA) which accepts the valid strings corresponding to a particular grammar and rejects the invalid strings. It does this by predicting each successive symbol in the input string. However, the SRN architecture sometime fails to encode the context necessary to predict the next input symbol. This happens when two different states in the FSA generating the strings have the same output, and the SRN develops similar hidden layer encodings for these states. The failure happens more often when number of units in the hidden layer is limited. We have developed a new architecture, called the Forced Simple Recurrent Network (FSRN), that solves this problem. This architecture contains additional output units, which are trained to show the current input and the previous context. Simulation results show that for certain classes of FSA with $u$ states, the SRN with $\lceil \log_2u \rceil$ units in the hidden layers fails, where as the FSRN with the same number of hidden layer units succeeds. ------------------------------------------------------------------------------- Copy of the postscript file has been placed in neuroprose archive. The file name is maskara.fsrn.ps.Z The usual instructions can be followed to obtain the file from the directory pub/neuroprose from the ftp site archive.cis.ohio-state.edu Arun Maskara  From ecai92 at ai.univie.ac.at Mon Mar 23 06:23:10 1992 From: ecai92 at ai.univie.ac.at (ECAI92 Vienna Conference Service) Date: Mon, 23 Mar 1992 12:23:10 +0100 Subject: ECAI92 Advance Information Message-ID: <199203231123.AA11857@dublin.ai.univie.ac.at> ======================================================================= Advance Information - ECAI92 - Advance Information - ECAI92 - VIENNA ======================================================================= 10th European Conference on Artificial Intelligence (ECAI 92) August 3-7, 1992, Vienna, Austria Programme Chairperson Bernd Neumann, University of Hamburg, Germany Local Arrangements Chairperson Werner Horn, Austrian Research Institute for AI, Vienna The European Conference on Artificial Intelligence (ECAI) is the European forum for scientific exchange and presentation of AI research. 
The aim of the conference is to cover all aspects of AI research and to bring together basic research and applied research. The Technical Programme will include paper presentations, invited talks, survey sessions, workshops, and tutorials. The conference is designed to cover all subfields of AI, including non-symbolic methods. ECAIs are held in alternate years and are organized by the European Coordinating Committee for Artificial Intelligence (ECCAI). The 10th ECAI in 1992 will be hosted by the Austrian Society for Artificial Intelligence (OGAI). The conference will take place at the Vienna University of Economics and Business Administration. PROGRAMME STRUCTURE Mon-Tue (Aug 3-4): Tutorials and Workshops Wed-Fri (Aug 5-7): Invited Talks, Paper Presentations, Survey Sessions Tue-Fri (Aug 4-7): Industrial Exhibition ======================== INVITED LECTURES ============================== Stanley J.Rosenschein (Teleos Research, Palo Alto, Calif., USA): Perception and Action in Autonomous Systems Oliviero Stock (IRST, Trento, Italy): A Third Modality of Natural Language? Promising Trends in Applied Natural Language Processing Peter Struss (Siemens AG, Muenchen, Germany): Knowledge-Based Diagnosis - An Important Challenge and Touchstone for AI =================== TECHNICAL PAPERS PROGRAMME ========================= This will consist of papers selected from the 680 that were submitted. These papers will be given in parallel sessions held from August 5 to 7, 1992. The topics of the papers include: - Automated Reasoning - Cognitive Modeling - Connectionist and PDP Models for AI - Distributed AI and Multiagent Systems - Enabling Technology and Systems - Integrated Systems - Knowledge Representation - Machine Learning - Natural Language - Philosophical Foundations - Planning, Scheduling, and Reasoning about Actions - Principles of AI Applications - Reasoning about Physical Systems - Robotics - Social, Economic, Legal, and Artistic Implications - User Interfaces - Verification, Validation & Test of Knowledge-Based Systems - Vision and Signal Understanding ============================ TUTORIALS ================================= --- Tutorials ----- Mon, August 3, 9:00-13:00 Applied Qualitative Reasoning Robert Milne, Intelligent Applications Ltd, Scotland, and Louise Trave-Massuyes, LAAS, Toulouse, France In Search of a New Planning Paradigm - Steps Beyond Classical Planning Joachim Hertzberg, GMD, Germany, and Sam Steel, Essex University, Cholchester, UK Machine Learning: Reality and Perspectives Lorenza Saitta, Universita di Torino, Italy --- Tutorials ----- Mon, August 3, 14:00-18:00 AI in Service and Support Anil Rewari, Digital Equipment Corp., Marlboro, Mass. Case-Based Reasoning Katia P. Sycara, Carnegie Mellon University, Pittsburgh, Penn. Computer Vision, Seeing Systems, and Their Applications Jan-Olof Eklundh, Royal Institute of Technology, Stockholm, Sweden Nonmonotonic Reasoning Gerhard Brewka, ICSI, Berkeley, Calif., and Kurt Konolige, SRI, Menlo Park, Calif. 
--- Tutorials ----- Tue, August 4, 9:00-13:00 Distributed AI Frank von Martial, Bonn, and Donald Steiner, Siemens AG, Germany Fuzzy Set-Based Methods for Inference and Control Henri Prade, IRIT, Universite Paul Sabatier, Toulouse, France Validation of Knowledge-Based Systems Jean-Pierre Laurent, Universite de Savoie, Chambery, France --- Tutorials ----- Tue, August 4, 14:00-18:00 Current Trends in Language Technology Harald Trost, Austrian Research Institute for AI and University of Vienna, Austria KADS: Practical, Structured KBS Development Robert Martil, Lloyd's Register of Shipping, Croydon, UK, and Bob Wielinga, University of Amsterdam, The Netherlands Neural Networks: From Theory to Applications Francoise Fogelman Soulie, Mimetics, France User Modeling and User-Adapted Interaction Sandra Carberry, University of Delaware, Newark, Delaware, and Alfred Kobsa, University of Konstanz, Germany ============================ WORKSHOPS ================================= Workshops are part of the ECAI92 scientific programme. They will give participants the opportunity to discuss specific technical topics in a small, informal environment, which encourages interaction and exchange of ideas. Persons interested in attending a workshop should contact the workshop organizer (addresses below), and the conference office (ADV) for ECAI92 registration. Note that all workshops require an early application for participation. A full description of all workshops can be obtained by sending an email to ecai92.ws at ai.univie.ac.at, which will automatically respond. --- Workshops ----- Mon, August 3 Art and AI: Art / ificial Intelligence Robert Trappl, Austrian Research Institute for Artificial Intelli- gence, Schottengasse 3, A-1010 Vienna, Austria; Fax: +43-1-630652, Email: robert at ai.univie.ac.at Coping with Linguistic Ambiguity in Typed Feature Formalisms Harald Trost, Austrian Research Institute for Artificial Intelli- gence, Schottengasse 3, A-1010 Vienna, Austria; Fax: +43-1-630652, Email: harald at ai.univie.ac.at Formal Specification Methods for Complex Reasoning Systems Jan Treur, AI Group, Dept.of Mathematics and Computer Science, Vrije Universiteit Amsterdam, De Boelelaan 108-1a, NL-1081 HV Amsterdam, The Netherlands; Fax: +31-29-6427705, Email: treur at cs.vu.nl Knowledge Sharing and Reuse: Ways and Means Nicolaas J.I. 
Mars, Dept.of Computer Science, University of Twente, PO Box 217, NL-7500 AE Enschede, The Netherlands; Fax: +31-53-339605, Email: mars at cs.utwente.nl Model-Based Reasoning Gerhard Friedrich, Franz Lackinger, Dept.Information Systems, CD-Lab for Expert Systems, Univ.of Technology, Paniglg.16, A-1040 Vienna; Fax: +43-1-5055304, Email: friedrich at vexpert.dbai.tuwien.ac.at Neural Networks and a New AI Georg Dorffner, Austrian Research Institute for Artificial Intelli- gence, Schottengasse 3, A-1010 Vienna, Austria; Fax: +43-1-630652, Email: georg at ai.univie.ac.at Scheduling of Production Processes Juergen Dorn, CD-Laboratory for Expert Systems, University of Technology, Paniglgasse 16, A-1040 Vienna, Austria; Fax: +43-1-5055304; Email: dorn at vexpert.dbai.tuwien.ac.at Validation, Verification and Test of KBS Marc Ayel, LIA, University of Savoie, BP.1104, F-73011 Chambery, France; Fax: +33-79-963475, Email: ayel at frgren81.bitnet --- Workshops ----- Tue, August 4 Advances in Real-Time Expert System Technologies Wolfgang Nejdl, Department for Information Systems, CD-Lab for Expert Systems, University of Technology, Paniglgasse 16, A-1040 Vienna, Austria; Fax: +43-1-5055304, Email: nejdl at vexpert.dbai.tuwien.ac.at Application Aspects of Distributed Artificial Intelligence Thies Wittig, Atlas Elektronik GmbH, Abt.TEF, Sebaldsbruecker Heerstrasse 235, D-W-2800 Bremen 44, Germany; Fax: +49-421-4573756, Email: t_wittig at eurokom.ie Applications of Reason Maintenance Systems Francois Charpillet, Jean-Paul Haton, CRIN/INRIA-Lorraine, B.P. 239, F-54506 Vandoeuvre-Les-Nancy Cedex, France; Fax: +33-93-413079, Email: charp at loria.crin.fr Artificial Intelligence and Music Gerhard Widmer, Austrian Research Institute for Artificial Intelli- gence, Schottengasse 3, A-1010 Vienna, Austria; Fax: +43-1-630652, Email: gerhard at ai.univie.ac.at Beyond Sequential Planning Gerd Grosse, FG Intellektik, TH Darmstadt, Alexanderstr.10, D-6100 Darmstadt, Germany; Fax: +49-6151-165326, Email: grosse at intellektik.informatik.th-darmstadt.de Concurrent Engineering: Requirements for Knowledge-Based Design Support Nel Wognum, Dept. of Computer Science, University of Twente, P.O.Box 217, NL-7500 AE Enschede, The Netherlands; Fax: +31-53-339605, Email: wognum at cs.utwente.nl Improving the Use of Knowledge-Based Systems with Explanations Patrick Brezillon, CNRS-LAFORIA, Box 169, University of Paris VI, 2 Place Jussieu, F-75252 Paris Cedex 05, France; Fax: +33-1-44277000, Email: brezil at laforia.ibp.fr The Theoretical Foundations of Knowledge Representation and Reasoning Gerhard Lakemeyer, Institut f.Informatik III, Universitaet Bonn, Roemerstr.164, D-W-5300 Bonn 1, Germany; Fax: +49-228-550382, Email: gerhard at uran.informatik.uni-bonn.de --- Workshops ----- Mon and Tue, August 3-4 Expert Judgement, Human Error, and Intelligent Systems Barry Silverman, Institute for AI, George Washington University, 2021 K St. 
  Barry Silverman, Institute for AI, George Washington University, 2021 K St. NW, Suite 710, Washington, DC 20006, USA; Fax: (202)785-3382, Email: barry at gwusun.gwu.edu
Logical Approaches to Machine Learning
  Celine Rouveirol, Universite Paris-Sud, LRI, Bat 490, F-91405 Orsay, France; Fax: +33-1-69416586, Email: celine at lri.lri.fr
Spatial Concepts: Connecting Cognitive Theories with Formal Representations
  Simone Pribbenow, Email: pribbeno at informatik.uni-hamburg.de, and Christoph Schlieder, Institut f. Informatik und Gesellschaft, Friedrichstr. 50, D-7800 Freiburg, Germany; Fax: +49-761-2034653, Email: cs at cognition.iig.uni-freiburg.de

======================== GENERAL INFORMATION ===========================

DELEGATE'S FEE (in Austrian Schillings, approx. 14 AS = 1 ECU, 12 AS = 1 US$)

                                           early      late       on-site
                            (rec. before)  (Jun 1)    (Jul 15)
  Members of ECCAI member organizations    4.500,-    5.000,-    6.000,-
  Non-Members                              5.000,-    6.000,-    7.000,-
  Students                                 1.500,-    2.000,-    2.500,-

The delegate's fee covers attendance at the scientific programme (invited talks, paper presentations, survey sessions, and workshops), conference documentation including the conference proceedings, admission to the industrial exhibition, and participation in selected evening events.

TUTORIAL FEE (per tutorial)

                                           early      late       on-site
                            (rec. before)  (Jun 1)    (Jul 15)
  Members of ECCAI member organizations    3.000,-    3.500,-    4.000,-
  Non-Members                              3.500,-    4.000,-    4.500,-
  Students                                 1.500,-    2.000,-    2.500,-

Tutorial registration entitles the holder to admission to that tutorial, admission to the exhibition, a copy of the course material, and refreshments during the tutorial.

ACCOMMODATION

Hotels of different price categories, ranging from DeLuxe to the very cheap student hostel (available for non-students too), are available for the first week of August. The price ranges (in AS) are given below.

Hotel Category        single room                    double room
                      with bath      without bath    with bath       without bath
DeLuxe *****          1690,-/2375,-                  2400,-/3200,-
A ****                 990,-/1300,-                  1400,-/1790,-
B ***                  750,-/980,-                   1100,-/1350,-
Season Hotel           480,-/660,-    335,-/450,-     780,-/900,-    580,-/730,-
Student Hostel         220,-                           380,-

The conference venue is located in a central district of Vienna. It can be reached easily by public transport.

============================ REGISTRATION ==============================

For detailed information and registration material please contact the conference office:

ADV c/o ECAI92
Trattnerhof 2
A-1010 Vienna, Austria
Tel: +43-1-5330913-74, Fax: +43-1-5330913-77, Telex: 75311178 adv a

or send your postal address via email to: ecai92 at ai.univie.ac.at

From mcolthea at laurel.ocs.mq.edu.au Mon Mar 23 17:24:19 1992 From: mcolthea at laurel.ocs.mq.edu.au (Max Coltheart) Date: Tue, 24 Mar 92 08:24:19 +1000 Subject: No subject Message-ID: <9203232224.AA12889@laurel.ocs.mq.edu.au>

Models Of Reading Aloud: Dual-Route And Parallel-Distributed-Processing Approaches

Max Coltheart, Brent Curtis and Paul Atkins
School of Behavioural Sciences, Macquarie University, Sydney NSW 2109, Australia
email: max at currawong.mqcc.mq.oz.au

Submitted for publication March 23, 1992.

Abstract

It has often been argued that various facts about skilled reading aloud cannot be explained by any model unless that model possesses a dual-route architecture: one route from print to speech that may be described as lexical (in the sense that it operates by retrieving pronunciations from a mental lexicon) and another route from print to speech that may be described as non-lexical (in the sense that it computes pronunciations by rule, rather than by retrieving them from a lexicon).
This broad claim has been challenged by Seidenberg and McClelland (1989, 1990). Their model has but a single route from print to speech, yet, they contend, it can account for major facts about reading which have hitherto been claimed to require a dual-route architecture. We identify six of these major facts about reading. The one-route model proposed by Seidenberg and McClelland can account for the first of these, but not the remaining five: how people read nonwords aloud, how they perform visual lexical decision, how two particular forms of acquired dyslexia can arise, and how different patterns of developmental dyslexia can arise. Since models with dual-route architectures can explain all six of these basic facts about reading, we suggest that this remains the viable architecture for any tenable model of skilled reading and learning to read.

Preprints available from MC at the above address.

From terry at jeeves.UCSD.EDU Wed Mar 25 15:14:36 1992 From: terry at jeeves.UCSD.EDU (Terry Sejnowski) Date: Wed, 25 Mar 92 12:14:36 PST Subject: Neural Computation 4:2 Message-ID: <9203252014.AA02301@jeeves.UCSD.EDU>

Neural Computation Volume 4, Issue 2, March 1992

Review
  First and Second-Order Methods for Learning: Steepest Descent and Newton's Method
  Roberto Battiti

Article
  Efficient Simplex-Like Methods for Equilibria of Nonsymmetric Analog Networks
  Douglas A. Miller and Steven W. Zucker

Note
  A Volatility Measure for Annealing in Feedback Neural Networks
  Joshua Alspector, Torsten Zeppenfeld and Stephan Luna

Letters
  What Does the Retina Know about Natural Scenes?
  Joseph J. Atick and A. Norman Redlich

  A Simple Network Showing Burst Synchronization without Frequency-Locking
  Christof Koch and Heinz Schuster

  On a Magnitude Preserving Iterative MAXnet Algorithm
  Bruce W. Suter and Matthew Kabrisky

  A Fixed Size Storage O(n^3) Time Complexity Learning Algorithm for Fully Recurrent Continually Running Networks
  Jurgen Schmidhuber

  Learning Complex, Extended Sequences Using the Principle of History Compression
  Jurgen Schmidhuber

  How Tight are the Vapnik-Chervonenkis Bounds?
  David Cohn and Gerald Tesauro

  Working Memory Networks for Learning Temporal Order with Application to 3-D Visual Object Recognition
  Gary Bradski, Gail A. Carpenter, and Stephen Grossberg

-----
SUBSCRIPTIONS - VOLUME 4 - BIMONTHLY (6 issues)
______ $40  Student
______ $65  Individual
______ $150 Institution
Add $12 for postage and handling outside USA (+7% for Canada).
(Back issues from Volumes 1-3 are regularly available for $28 each.)
***** Special Offer -- Back Issues for $17 each *****
MIT Press Journals, 55 Hayward Street, Cambridge, MA 02142. (617) 253-2889.
-----

From bap at james.psych.yale.edu Wed Mar 25 12:09:56 1992 From: bap at james.psych.yale.edu (Barak Pearlmutter) Date: Wed, 25 Mar 92 12:09:56 -0500 Subject: Why batch learning is slower In-Reply-To: "Thomas H. Hildebrandt "'s message of Fri, 20 Mar 92 10:37:57 -0500 <9203201537.AA04951@athos.csee.lehigh.edu> Message-ID: <9203251709.AA19744@james.psych.yale.edu>

For a quadratic form minimum, in the batch case, without momentum, it is well known that a stepsize eta < 2/lmax (where lmax = max eigenvalue of Hessian) gives convergence to the minimum. However, this is not true in the online or "stochastic gradient descent" case. In that case, a fixed stepsize leads to convergence to a neighborhood of the minimum, where the size of the neighborhood is determined by the stepsize, since it amplifies the noise in the noisy samples of the gradient.
For this reason, it is necessary for the stepsize to go to zero in the limit in the online case. In fact, it can be shown that $\sum_t \eta(t)$ must go to infinity to prevent convergence to a non-minimum, while $\sum_t \eta(t)^2$ must not go to infinity, to guarantee convergence. If these two conditions are satisfied, convergence to a minimum is achieved with probability 1.

Of course, you are studying the "cyclic online presentation" case, which, although it fulfills the conditions of stochastic gradient, also has some additional structure. However, it is easy to convince yourself that this case does not permit a fixed step size. Consider a 1-d system, with two patterns, one of which gives $E_1 = (w-1)^2$ and the other $E_2 = (w+1)^2.$ Notice that $E = E_1 + E_2$ has a minimum at $w=0$. But with a step size that does not go to zero, $w$ will flip back and forth forever.

--Barak Pearlmutter.
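(To see the effect numerically, here is a minimal sketch in plain Python, written for this digest rather than taken from either paper: it runs cyclic per-pattern gradient descent on the two-pattern 1-d example above, once with a constant step size and once with the decaying schedule eta(t) = eta0/(1+t), which satisfies the two conditions stated earlier. The function name and the particular schedule are illustrative choices only.)

def run(step_size, decay=False, sweeps=2000, w=2.0):
    # Cyclic online gradient descent on E_1(w) = (w-1)^2 and E_2(w) = (w+1)^2;
    # the total error E = E_1 + E_2 has its minimum at w = 0.
    targets = [1.0, -1.0]                    # pattern k contributes (w - target_k)^2
    for t in range(sweeps):
        eta = step_size / (1.0 + t) if decay else step_size
        trace = []
        for target in targets:               # one cyclic presentation of both patterns
            w -= eta * 2.0 * (w - target)    # gradient of (w - target)^2 is 2(w - target)
            trace.append(w)
    return trace                             # w after each pattern of the final sweep

print("fixed eta = 0.25 :", run(0.25))               # stuck oscillating near [+1/3, -1/3]
print("decaying eta(t)  :", run(0.25, decay=True))   # both values close to the minimum w = 0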
From p-mehra at uiuc.edu Thu Mar 26 01:17:36 1992 From: p-mehra at uiuc.edu (Pankaj Mehra) Date: Thu, 26 Mar 92 00:17:36 CST Subject: Debate on recent paper by Amari Message-ID: <9203260617.AA20244@rhea>

Fellow Connectionists: I have recently come across some thought-provoking debate between Dr. Andras Pellionisz and Prof. Shun-Ichi Amari regarding Prof. Amari's recent paper in Neural Networks. (Refer to an earlier message from me, 09/11/91, "Exploiting duality to analyze ANNs".) The paper in question is: Shun-Ichi Amari, ``Dualistic Geometry of the Manifold of Higher-Order Neurons,'' Neural Networks, vol. 4, pp. 443-451, 1991.

At the risk of irritating some of you, I am reproducing excerpts of this debate from recent issues of Neuron Digest, a moderated discussion forum. I urge you to (i) continue the debate on that forum instead of swamping the connectionists mailing list with responses, and (ii) refrain from sending me e-mail responses. I understand that long messages to the entire list are discouraged, but I feel compelled to post this message because of its obvious relevance to all of us.

Following are excerpts of the recent debate gleaned from Neuron Digest, Vol. 9, Issues 8-10 and 14. Contact the moderator for subscriptions and submissions. Old issues are available via FTP from cattell.psych.upenn.edu (128.91.2.173). In what follows, "Editor's Notes" refer to the moderator's comments, not mine. -Pankaj Mehra, University of Illinois

____________________________________________________________________________

Neuron Digest Saturday, 22 Feb 1992 Volume 9 : Issue 8 Today's Topics: Open letter to Dr. Sun-Ichi Amari

------------------------------

From hirsch at math.berkeley.edu Mon Mar 2 20:23:03 1992 From: hirsch at math.berkeley.edu (Morris W. Hirsch) Date: Mon, 02 Mar 92 17:23:03 -0800 Subject: Reply to Pellionisz' "Open Letter" Message-ID:

Dear Michael [Arbib]: Bravo for your reply to Pellionisz, and for sending it to Neuron Digest! I was going to reply to ND along similar but less knowledgeable lines -- I may still do so if the so-called dispute continues. Yours, --MOE

Professor Morris W. Hirsch, Department of Mathematics, University of California, Berkeley, CA 94720 USA; e-mail: hirsch at math.berkeley.edu; Phones: NEW area code 510: Office, 642-4318; Messages, 642-5026, 642-6550; Fax: 510-642-6726

------------------------------

From Konrad.Weigl at sophia.inria.fr Wed Mar 4 06:24:22 1992 From: Konrad.Weigl at sophia.inria.fr (Konrad Weigl) Date: Wed, 04 Mar 92 12:24:22 +0100 Subject: Arbib's response to "open letter to Amari" Message-ID:

Ref.: Dr. Arbib's answer in Neuron Digest V9 #9

I am not familiar enough with the work of Pellionisz, or proficient enough in the mathematics of General Spaces, to judge the mathematical rigour of his work; however, whatever that rigour, as far as I know he was the first to link the concept of Neural Networks with non-Euclidean Geometry at all. That he did so to analyze the dynamics of biological Neural networks, and that Dr. Amari later used non-Euclidean Geometry to give a metric to spaces of different types of Neural Networks, and a geometric interpretation of learning, does not change one iota of that fact. This is not to denigrate Dr. Amari's contribution to the field, of course.

Konrad Weigl, Projet Pastis, INRIA-Sophia Antipolis, 2004 Route des Lucioles, B.P. 109, 06561 Valbonne Cedex, France; Tel. (France) 93 65 78 63, Fax (France) 93 65 76 43, email Weigl at sophia.inria.fr

------------------------------

From ishimaru at hamamatsu-pc.ac.jp Sat Mar 7 07:58:59 1992 From: ishimaru at hamamatsu-pc.ac.jp (Kiyoto Ishimaru) Date: Sat, 07 Mar 92 14:58:59 +0200 Subject: Pellionisz' "Open Letter" Message-ID:

Dear Moderator: The recent response of Dr. Arbib caught my attention. The following is my opinion about the issue:

1) Citing or not citing in a paper should not be decided based on KINDNESS to earlier related research, but rather on KINDNESS to those who are supposed to read or come across the prospective paper.

2) A similarity-or-dissimilarity argument, based on the "subjective" Riemannian space, could last forever without any positive result. The most important thing is who was the first person to bring the tensor analysis "technique" into the NN field. That is what should be discussed.

3) Political or social issues, such as Japan-bashing and fierce world-wide competition in R&D, should not be brought into this matter. That style of discussion does not produce any fruitful results, but makes the issue more complicated, worse, and intangible.

4) Dr. Amari's comments in his letter, quoted by Dr. Pellionisz -- "Indeed, when I wrote that paper, I thought to refer to your paper", and "But if I did so, I could only state that it is nothing to do with the geometrical approach that I initiated" -- may lead to the following conclusion: if Dr. Pellionisz's paper had nothing to do with Dr. Amari's, citing Dr. Pellionisz would not have crossed Dr. Amari's mind at all (but he actually did think it over). Direct public comments from Dr. Amari are urged on this point and others, so that both of them may receive a fair judgement.

Please note that I make these comments with some hesitation, due to my lack of deep background regarding the influence of tensor analysis on the NN field. However, I am pleased to express myself in this e-mail. Thank you.

Sincerely yours, Kiyoto Ishimaru, Dept of Computer Science, Hamamatsu Polytechnic College, 643 Norieda, Hamamatsu 432, Japan; e-mail: ishimaru at hamamatsu-pc.ac.jp

------------------------------

From amari at sat.t.u-tokyo.ac.jp Tue Mar 10 11:16:32 1992 From: amari at sat.t.u-tokyo.ac.jp (Shun-ichi Amari) Date: Tue, 10 Mar 92 18:16:32 +0200 Subject: reply to the open letter to Amari Message-ID:

[[ Editor's Note: In a personal note, I thanked Dr. Amari for his response. I had assumed, incorrectly, that Dr. Pellionisz had sent a copy to Dr. Amari, who is not a Neuron Digest subscriber. I'm sure all readers will remember that Neuron Digest is not a peer-refereed journal but an informal forum for electronic communication.
I hope the debate can come to a fruitful conclusion -PM ]]

Dear Editor: Professor Usui at Toyohashi Institute of Technology and Science kindly let me know that there is an "open letter to Amari" in Neuron Digest. I was very surprised that an open letter to me was published without its being sent to me. Moreover, the letter asks me to answer, yet again, what I have already answered to Dr. Pellionisz. I will try to repeat my answer here in more detail.

Reply to Dr. Pellionisz by Shun-ichi Amari

1. Dr. Pellionisz accused me of holding two contradictory opinions: 1) my work is a generalization of his, and 2) my approach has nothing to do with his. This is incorrect. Once one reads my paper ("Dualistic geometry of the manifold of higher-order neurons", Neural Networks, vol. 4 (1991), pp. 443-451; see also another paper, "Information geometry of Boltzmann machines" by S. Amari, K. Kurata and H. Nagaoka, IEEE Trans. on Neural Networks, March 1992), it is immediately clear 1) that my work is in no way a generalization of his and 2), more strongly, that it has nothing to do with Pellionisz's work. Dr. Pellionisz seems to be accusing me without having read or understood my paper at all. I would like to ask the readers to read my paper. For those readers who have not yet read it, I compare his work with mine in the following, because this is what Dr. Pellionisz has carefully avoided.

2. We can summarize his work as the claim that the main function of the cerebellum is the transformation of a covariant vector into a contravariant vector in a metric Euclidean space, since non-orthogonal reference bases are used in the brain. He mentioned non-linear generalizations and so on in passing, but nothing scientific has been done along this line.

3. In my 1991 paper, I proposed a geometrical theory of the manifold of parameterized non-linear systems, with special reference to the manifold of non-linear higher-order neurons. I did not focus on any function of a single neural network, but on the mutual relations among different neural networks, such as the distance between two different neural networks, the curvature of a family of neural networks and its role, etc. Here the method of information geometry plays a fundamental role. It uses a dual pair of affine connections, which is a new concept in differential geometry and has proved very useful for analyzing statistical inference problems, multiterminal information theory, the manifold of linear control systems, and so on (see S. Amari, Differential Geometrical Methods of Statistics, Springer Lecture Notes in Statistics, vol. 28, 1985, and many papers referred to in its second printing). A number of mathematicians are now studying this new subject. I have shown that the same method of information geometry is applicable to the manifold of neural networks, elucidating the capabilities and limitations of a family of neural networks in terms of their architecture. I have opened, I believe, a new fertile field of study: not the behaviors of single neural networks, but the collective properties of the set, or manifold, of neural networks, in terms of new differential geometry.

4. Now we can discuss the point. Is my theory a generalization of his theory? Definitely not. If A is a generalization of B, A should include B as a special case. My theory does not include any of his tensorial transformations. A network is merely a point of the manifold in my theory. I have studied the collective behavior of the manifold, not the properties of individual points.

5. The second point.
One may ask whether, even if my theory is not a generalization of his theory, it might still have something to do with his theory, so that I should have referred to his work. The answer is again no. Dr. Pellionisz insists that he is a pioneer of tensor theory and that my theory is also tensorial. This is not true. My theory is differential-geometrical, but it does not require any tensorial notation. Modern differential geometry has been constructed without using tensorial notations, although it is sometimes convenient to use them. As one sees from my paper, its essential part is described without tensor notations. In differential geometry, what is important is the intrinsic structure of manifolds, such as affine connections, parallel transports, curvatures, and so on. The Pellionisz theory has nothing to do with these differential-geometrical concepts. He used the tensorial notation to point out that the role of the cerebellum is a special type of linear transformation, namely a covariant-contravariant linear transformation, a claim which M. A. Arbib and I have criticized.

6. Dr. Pellionisz claims that he is the pioneer of the tensorial theory of neural networks. Whenever one uses a tensor, should one refer to Pellionisz? This is ridiculous. Who would claim to be the pioneer of using differential equations, linear algebra, probability theory, etc. in neural network theory? It is just a commonly used method. Moreover, the tensorial method itself has been used since the early days of neural network research. For example, in my 1967 paper (S. Amari, A mathematical theory of adaptive pattern classifiers, IEEE Trans. on EC, vol. 16, pp. 299-307), where I proposed the general stochastic gradient learning method for multilayer networks, I used the metric tensor C (p. 301) in order to transform a covariant gradient vector to the corresponding contravariant learning vector, although I suppressed the tensor notation there. However, on p. 303, I explicitly used the tensorial notation in order to analyze the dynamic behavior of modifiable parameter vectors. I have never claimed that Dr. Pellionisz should refer to this old paper, because the tensorial method itself is in common use among applied mathematicians, and my old theory has nothing to do with his except that both used a covariant-contravariant transformation and tensorial notations, a common mathematical concept.

7. I do not like non-productive, time-consuming and non-scientific discussions like this. If one reads my paper, everything will melt away. This has nothing to do with the fact that I am, unfortunately, a co-editor-in-chief of Neural Networks, or with a threat to the intellectual property (of tensorial theory), or with the world-wide competition in scientific research, etc., which Dr. Pellionisz hinted were in the background. Instead, this reminded me of the horrible days when Professor M. A. Arbib and I were preparing the righteous criticism of his theory (not a criticism of using the tensor concept, but of the theory itself). I repeatedly received astonishing interference, which I hope will never happen again. I will disclose my e-mail letter to Pellionisz in the following, hoping that he discloses his first letter, including his unbelievable request, because it makes the situation and his desire clear. The reader will understand why I do not want to continue fruitless discussions with him. I also request him to read my paper and to point out which concepts or theories in my paper are generalizations of his.
The following is my old reply to Dr. Pellionisz, which he partly referred to in his "open letter to Amari".

Dear Dr. Pellionisz: Thank you for your e-mail remarking on my recent paper entitled "Dualistic geometry of the manifold of higher-order neurons". As you know very well, we criticized your idea of a tensorial approach in our memorial joint paper with M. Arbib. The point is that, although the tensorial approach is welcome, it is too restrictive to think that the brain function is merely a transformation between contravariant vectors and covariant vectors; even if we use linear approximations, the transformation should be free of the positivity and symmetry restrictions. As you may understand, these two are the essential restrictions of covariant-contravariant transformations.

You are interested in analyzing a general, but single, neural network. Of course this is very important. However, what I am interested in is the geometrical structure of a set of neural networks (in other words, a set of brains). This is a new object of research. Of course, I did some work along this line in statistical neurodynamics, where a probability measure is introduced on a manifold of neural networks, and physicists later followed a similar idea (E. Gardner and others). However, the geometrical structure there is implicit. As you noted, I have written that my paper opens a new fertile field of neural network research, in the following two senses. First, we are treating a set of networks, not the behavior of a single network. There is a vast amount of research on single networks by analytical, stochastic, tensorial and many other mathematical methods; the point is to treat a new object of research, a manifold of neural networks. Secondly, I have proposed a new concept of dual affine connections, which mathematicians have recently been studying in more detail as mathematical research.

So if you have studied the differential geometrical structure of a manifold of neural networks, I should refer to it. If you have proposed a new concept of duality in affine connections, I should refer to it. If you are claiming that you used tensor analysis in analyzing the behaviors of single neural networks, that has nothing to do with the field which I have opened. Indeed, when I wrote that paper, I thought to refer to your paper. But if I did so, I could only state that it has nothing to do with this new approach; moreover, I would need to repeat our memorial criticism again. I do not want to engage in such irrelevant discussions. If you read my paper, I think you will understand what is newly opened by this approach. Since our righteous criticism of your memorable approach has been published, we do not need to repeat it again and again. I do hope your misunderstanding is resolved by this mail and by reading my paper. Sincerely yours, Shun-ichi Amari

------------------------------

Neuron Digest Wednesday, 25 Mar 1992 Volume 9 : Issue 14 Today's Topics: Is it time to change our referring? ("Open Letter" debate)

------------------------------

From Miklos.Boda at eua.ericsson.se Wed Mar 25 10:50:35 1992 From: Miklos.Boda at eua.ericsson.se (Miklos.Boda@eua.ericsson.se) Date: Wed, 25 Mar 92 16:50:35 +0100 Subject: Is it time to change our referring? ("Open Letter" debate) Message-ID:

[[ Editor's Note: Another contribution to the ongoing discussion prompted by Pellionisz' "Open Letter" from several issues ago. I think the following deals with some of the larger issues of citation which are important to consider. -PM ]]

Dear Moderator:

IS IT TIME TO CHANGE OUR REFERRING?
My contribution to the issue of "referring or not", initiated by Dr. Pellionisz's "Open Letter":

1. It is obvious that Dr. Pellionisz introduced a brilliant concept when he brought the tensor analysis approach into Neural Network research. Despite any imperfections that may exist in a new approach to the mathematics of General Neural Spaces, his theory has already been used by several followers and its influence is undeniable. Dr. Amari certainly has the right to claim that his differential-geometrical approach has nothing to do with earlier comparable (tensor) approaches. However, readers would be left uncertain if an oversight occurred, and questions would be put to Dr. Amari as to how he compares his approach to Pellionisz's.

2. The issue of referring or not is, unfortunately, a classic problem (mostly between former colleagues who have a grudge against each other, or between international competitors; see the similar debate in AIDS research recently). The problem can only be solved if we start talking openly about this serious issue, even if it sometimes feels inconvenient to do so. (Thus it was a good and brave move by Dr. Pellionisz to bring the subject up.)

3. Why do we use reference lists at all? a. First, we must list all titles which we actually used. b. Second, by tradition, we help the reader by giving some basic references for a better general understanding. c. Third, we establish the claims of our paper over comparable approaches, i.e., we claim the novelty of ideas that only superficially seem related to other approaches. I think Dr. Amari may have had point a. in mind, whereas Dr. Pellionisz may consider at least points a. and c. important; those who wish to be kind to the reader would also consider point b.

4. Maybe it is now time to change our habits and adapt to the new computerized literature search, where we can find "comparables" by looking for keywords. Declaring proper keywords could therefore replace references (anyone who searches the neural network literature for "tensor" will get his hands full of Pellionisz's papers). Using such a method, one could restrict citation to those items that one actually uses. This new method, so far, is not universally accepted, and it would not state the author's claims over comparable approaches. No search for "AIDS virus" would settle claims about who pioneered an approach; the claims themselves must originate from the authors.

5. More on the etiquette of debate: I hope Dr. Arbib already regrets his precipitate remarks (ND v9#9).

Miklos Boda, Ellemtel, Telecommunication Systems Laboratories, Box 1505, 125 25 Alvsjo, Sweden

------------------------------

From pagre at weber.UCSD.EDU Thu Mar 26 18:58:07 1992 From: pagre at weber.UCSD.EDU (Phil Agre) Date: Thu, 26 Mar 92 15:58:07 -0800 Subject: AI Journal Message-ID: <9203262358.AA21526@weber>

Stan Rosenschein and I are editing a special issue of the AI Journal on "Computational Theories of Interaction and Agency". We started from the observation that a wide variety of people in AI and cognitive science are using principled characterizations of interactions between agents and their environments to guide their theorizing and designing and modeling. Some connectionist projects I've heard about fit this description as well, and people engaged in such projects would be most welcome to contribute articles to our special issue. I've enclosed the call for papers. Please feel free to pass it along to anyone who might be interested.
And I can send further details to anyone who's curious. Thanks very much. Phil Agre, UCSD Artificial Intelligence: An International Journal Special Issue on Computational Theories of Interaction and Agency Edited by Philip E. Agre (UC San Diego) and Stanley J. Rosenschein (Teleos Research) Call for Papers Recent computational research has greatly deepened our understanding of agents' interactions with their environments. The first round of research in this area developed `situated' and `reactive' architectures that interact with their environments in a flexible way. These `environments', however, were characterized in very general terms, and often purely negatively, as `uncertain', `unpredictable', and the like. In the newer round of research, psychologists and engineers are using sophisticated characterizations of agent-environment interactions to motivate explanatory theories and design rationales. This research opens up a wide variety of new issues for computational research. But more fundamentally, it also suggests a revised conception of computation itself as something that happens in an agent's involvements in its world, and not just in the abstractions of its thought. The purpose of this special issue of Artificial Intelligence is to draw together the remarkable variety of computational research that has recently been developing along these lines. These include: * Task-level robot sensing and action strategies, as well as projects that integrate classical robot dynamics with symbolic reasoning. * Automata-theoretic formalizations of agent-environment interactions. * Studies of "active vision" and related projects that approach perception within the broader context of situated activity. * Theories of the social conventions and dynamics that support activity. * Foundational analyses of situated computation. * Models of learning that detect regularities in the interactions between an agent and its environment. This list is only representative and could easily be extended to include further topics in robotics, agent architectures, artificial life, reactive planning, distributed AI, human-computer interaction, cognitive science, and other areas. What unifies these seemingly disparate research projects is their emerging awareness that the explanation and design of agents depends on principled characterizations of the interactions between those agents and their environments. We hope that this special issue of the AI Journal will clarify trends in this new research and take a first step towards a synthesis. The articles in the special issue will probably also be reprinted in a book to be published by MIT Press. The deadline for submitted articles is 1 September 1992. Send articles to: Philip E. Agre Department of Communication D-003 University of California, San Diego La Jolla, California 92093-0503 Queries about the special issue may be sent to the above address or to pagre at weber.ucsd.edu. Prospective contributors are encouraged to contact the editors well before the deadline.  From hinton at ai.toronto.edu Fri Mar 27 10:20:09 1992 From: hinton at ai.toronto.edu (Geoffrey Hinton) Date: Fri, 27 Mar 1992 10:20:09 -0500 Subject: attempted character assasination Message-ID: <92Mar27.102016edt.319@neuron.ai.toronto.edu> Amari has made profound and original contributions to the neural network field and I think people should read and understand his papers before they clutter up our mailing list with the wild accusations on neuron-digest. 
Geoff

From thildebr at aragorn.csee.lehigh.edu Sun Mar 29 10:07:02 1992 From: thildebr at aragorn.csee.lehigh.edu (Thomas H. Hildebrandt ) Date: Sun, 29 Mar 92 10:07:02 -0500 Subject: Why batch learning is slower In-Reply-To: Barak Pearlmutter's message of Wed, 25 Mar 92 12:09:56 -0500 <9203251709.AA19744@james.psych.yale.edu> Message-ID: <9203291507.AA09353@aragorn.csee.lehigh.edu>

Date: Wed, 25 Mar 92 12:09:56 -0500 From: Barak Pearlmutter Reply-To: Pearlmutter-Barak at CS.YALE.EDU

Begin Pearlmutter quote:

Of course, you are studying the "cyclic online presentation" case, which, although it fulfills the conditions of stochastic gradient, also has some additional structure. However, it is easy to convince yourself that this case does not permit a fixed step size. Consider a 1-d system, with two patterns, one of which gives $E_1 = (w-1)^2$ and the other $E_2 = (w+1)^2.$ Notice that $E = E_1 + E_2$ has a minimum at $w=0$. But with a step size that does not go to zero, $w$ will flip back and forth forever. --Barak Pearlmutter.

End quote

One of the interesting results of the paper is that the rapidity of convergence can be judged in terms of the redundancy of the training set, where (as mentioned in the announcement) redundancy is measured in terms of the correlation (inner product) between pairs of samples. A more thorough analysis is needed, but at first glance, it appears that if the redundancy (collinearity, correlation) measure is $R \in [-1, 1]$, then the convergence rate $C$ (how fast the algorithm approaches the minimum) is given by $C = 1 / (1 - |R|)$.

One may consider the most redundant pair of samples to dominate the convergence rate (one place where more analysis is needed). If a pair of samples is collinear then, as you have pointed out, the convergence rate goes to zero; the above equation gives this directly, since the redundancy R is 1 in that case and $C$ diverges. If all of the samples are orthogonal, on the other hand, the per-sample algorithm will find the absolute minimum in one epoch. For intermediate degrees of collinearity, $C$ gives a measure of the severity of the "cross-stitching" which the MGD algorithm will suffer, with larger values of $C$ (that is, values of $1 - |R|$ closer to zero) indicating slower convergence.

All of the above discussion is in terms of the linear network described in my paper, but it is hard to see how adding nonlinearity to the problem can improve the situation, except by chance.

Thomas H. Hildebrandt CSEE Department Lehigh University
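(The correlation point is easy to check numerically. The sketch below, in plain Python, assumes nothing from the paper beyond what the message states: a linear unit with two inputs is trained by per-sample updates that solve each sample exactly, on two unit-length input vectors whose inner product is R. With R = 0 one epoch suffices; as |R| approaches 1 the zig-zagging drives the epoch count up roughly like 1/(1 - |R|). The function epochs_to_fit, the targets and the tolerance are illustrative choices only, not Hildebrandt's code.)

import math

def epochs_to_fit(R, tol=1e-6, max_epochs=100000):
    # Two unit-length inputs with inner product R, and arbitrary targets.
    x1, y1 = (1.0, 0.0), 1.0
    x2, y2 = (R, math.sqrt(1.0 - R * R)), -1.0
    w = [0.0, 0.0]
    for epoch in range(1, max_epochs + 1):
        for (x, y) in ((x1, y1), (x2, y2)):
            err = y - (w[0] * x[0] + w[1] * x[1])
            w[0] += err * x[0]          # step size 1/|x|^2 = 1: each update projects w
            w[1] += err * x[1]          # onto the solution hyperplane of its own sample
        worst = max(abs(y1 - (w[0] * x1[0] + w[1] * x1[1])),
                    abs(y2 - (w[0] * x2[0] + w[1] * x2[1])))
        if worst < tol:
            return epoch
    return max_epochs

for R in (0.0, 0.5, 0.9, 0.99):
    print(R, epochs_to_fit(R))          # epoch counts grow sharply as |R| -> 1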
From thildebr at aragorn.csee.lehigh.edu Mon Mar 30 10:47:48 1992 From: thildebr at aragorn.csee.lehigh.edu (Thomas H. Hildebrandt ) Date: Mon, 30 Mar 92 10:47:48 -0500 Subject: Paper on Neocognitron Training avail on neuroprose In-Reply-To: David Lovell's message of Thu, 12 Mar 92 10:48:29 EST <9203120048.AA02305@c10.elec.uq.oz.au> Message-ID: <9203301547.AA10142@aragorn.csee.lehigh.edu>

About two weeks ago, David Lovell posted to CONNECTIONISTS advertising the note which he has placed in the neuroprose archive. I have a few comments on the paper.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
A NOTE ON A CLOSED-FORM TRAINING ALGORITHM FOR THE NEOCOGNITRON
David Lovell, Ah Chung Tsoi & Tom Downs
Intelligent Machines Laboratory, Department of Electrical Engineering, University of Queensland, Queensland 4072, Australia

In this note, a difficulty with the application of Hildebrandt's closed-form training algorithm for the neocognitron is reported. In applying this algorithm we have observed that S-cells frequently fail to respond to features that they have been trained to extract. We present results which indicate that this training vector rejection is an important factor in the overall classification performance of the neocognitron trained using Hildebrandt's procedure.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

In my paper "Optimal training of thresholded linear correlation classifiers" (IEEE Transactions on Neural Networks, Nov. 1991), a one-step training procedure is outlined which configures the cells in a layer of Fukushima-type neurons so that their classification regions are mutually exclusive. Starting at 3 dimensions, this mutual exclusion condition causes gaps to open up among the cones that define the classification regions. An input pattern which falls into one of these gaps will be rejected. As the number of dimensions considered increases, so does the relative volume assigned to these rejection regions.

In the general linear model presented in the paper, if the number of classes is fewer than the number of dimensions, then the half-angle at the vertex of the cone is set to 45 degrees in order to obtain mutual exclusion. On the other hand, to obtain complete coverage (no rejections), it is necessary to choose an angle which depends on the number of dimensions as follows:

    cos(vertex half-angle) = 1 / sqrt(dimensions),

i.e. vertex half-angle = arccos(1 / sqrt(dimensions)). So for the first 10 dimensions, the half-angles at which complete coverage is achieved are given by (in degrees):

    0  45  54.73  60  63.43  65.9  67.79  69.29  70.53  71.56

In Fukushima's training procedure, classification regions compete for training patterns, so that in its final state the configuration of the network more closely resembles one which achieves complete coverage than one which achieves mutual exclusion of the classification regions. In classification problems, the correct classification rate and the rejection rate are fundamentally in opposition. Therefore, it is not surprising that the network trained using Fukushima's procedure achieved a higher classification rate than the one trained using my one-step training procedure. A fairer comparison would be obtained by relaxing the mutual exclusion of REGIONS in the latter network to the mutual exclusion of SAMPLES (i.e. requiring that a training sample fall in one and only one classification region). In that case, the rejection rate for my network is expected to be lower, in general, and the classification rate correspondingly higher.

Thomas H. Hildebrandt CSEE Department Lehigh University
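(Since the coverage formula above is easy to misread, here is a short check in plain Python, not taken from the paper, that the angles listed are simply arccos(1/sqrt(d)) expressed in degrees.)

import math

for d in range(1, 11):
    # vertex half-angle for complete coverage in d dimensions
    print(d, round(math.degrees(math.acos(1.0 / math.sqrt(d))), 2))
# Prints 0.0, 45.0, 54.74, 60.0, 63.43, 65.91, 67.79, 69.3, 70.53, 71.57 --
# the same series quoted in the message, up to rounding in the original.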
(1987) "Neural Darwinism: the Theory of Neuronal Group Selection", Basic, NY Edelman, G.M. (1989) "The Remembered Present: a Biological Theory of Consciousness", Basic, NY Edelman, G.M. (1981) Group selection as the basis for higher brain function. In "The Organization of the Cerebral Cortex", F.O. Schmitt, F.G. Worden, G. Adelman, S.G. Dennis, eds., pp. 535-563, MIT Press, Cambridge, Mass. Edelman, G.M. (1978) Group selection and phasic reentrant signalling: a theory of higher brain function. In "The Mindful Brain", G.M. Edelman, V.B. Mountcastle, eds., pp. 51-100, MIT Press, Cambridge, Mass. Edelman, G.M., Finkel, L.H. (1984) Neuronal group selection in the cerebral cortex. In "Dynamic Aspects of Neocortical Function", G.M. Edelman, W.E. Gall, W.M. Cowan, eds., pp. 653-695. Wiley, NY Edelman, G.M., Reeke, G.N., (1982) Selective networks capable of representative transformations, limited generalizations, and associative memory, Proc. Natl. Acad. Sci. USA 79:2091-2095 Finkel, L.H., Edelman, G.M. (1987) Population rules for synapses in networks. In "Synaptic Function", G.M. Edelman, W.E. Gall, W.M. Cowan, eds., pp.711-757, Wiley, NY Finkel, L.H., Edelman, G.M. (1985) Interaction of synaptic modification rules within populations of neurons, Proc. Natl. Acad. Sci. USA 82:1291-1295 Reeke, G.N., Finkel, L.H., Sporns, O., G.M. Edelman, (1989) Synthetic neural modeling: a multilevel approach to the analysis of brain complexity. In "Signal and Sense: Local and Global Order in Perceptual Maps", G.M. Edelman, W.E. Gall, W.M. Cowan, eds., Wiley, NY, pp. 607-707  From netgene at virus.fki.dth.dk Fri Mar 20 07:30:41 1992 From: netgene at virus.fki.dth.dk (virus mail server) Date: Fri, 20 Mar 92 13:30:41 +0100 Subject: HUMOPS: NetGene splice site prediction Message-ID: ------------------------------------------------------------------------ NetGene Neural Network Prediction of Splice Sites Reference: Brunak, S., Engelbrecht, J., and Knudsen, S. (1991). Prediction of Human mRNA donor and acceptor sites from the DNA sequence. Journal of Molecular Biology 220:49-65. ------------------------------------------------------------------------ Report ERRORS to Jacob Engelbrecht engel at virus.fki.dth.dk. Potential splice sites are assigned by combining output from a local and a global network. The prediction is made with two cutoffs: 1) Highly confident sites (no or few false positives, on average 50% of the true sites detected); 2) Nearly all true sites (more false positives - on average of all positions 0.1% false positive donor sites and 0.4% false positive acceptor sites, at 95% detection of true sites). The network performance on sequences from distantly related organisms has not been quantified. Due to the non-local nature of the algorithm sites closer than 225 nucleotides to the ends of the sequence cannot be assigned. Column explanations, field identifiers: POSITION in your sequence (either first or last base in intron). Joint CONFIDENCE level for the site (relative to the cutoff). EXON INTRON gives 20 bases of sequence around the predicted site. LOCAL is the site confidence from the local network. GLOBAL is the site confidence from the global network. 
------------------------------------------------------------------------ The sequence: HUMOPS contains 6953 bases, and has the following composition: A 1524 C 2022 G 1796 T 1611 1) HIGHLY CONFIDENT SITES: ========================== ACCEPTOR SITES: POSITION CONFIDENCE INTRON EXON LOCAL GLOBAL 4094 0.27 TGTCCTGCAG^GCCGCTGCCC 0.63 0.66 5167 0.20 TGCCTTCCAG^TTCCGGAACT 0.59 0.64 3812 0.17 CTGTCCTCAG^GTACATCCCC 0.68 0.54 3164 0.02 TCCTCCTCAG^TCTTGCTAGG 0.79 0.32 2438 0.01 TGCCTTGCAG^GTGAAATTGC 0.78 0.33 DONOR SITES: POSITION CONFIDENCE EXON INTRON LOCAL GLOBAL 3979 0.38 CGTCAAGGAG^GTACGGGCCG 0.92 0.74 2608 0.17 GCTGGTCCAG^GTAATGGCAC 0.85 0.54 4335 0.06 GAACAAGCAG^GTGCCTACTG 0.83 0.41 2) NEARLY ALL TRUE SITES: ========================= ACCEPTOR SITES: POSITION CONFIDENCE INTRON EXON LOCAL GLOBAL 4094 0.55 TGTCCTGCAG^GCCGCTGCCC 0.63 0.66 3812 0.52 CTGTCCTCAG^GTACATCCCC 0.68 0.54 3164 0.49 TCCTCCTCAG^TCTTGCTAGG 0.79 0.32 5167 0.49 TGCCTTCCAG^TTCCGGAACT 0.59 0.64 2438 0.48 TGCCTTGCAG^GTGAAATTGC 0.78 0.33 4858 0.39 TCATCCATAG^AAAGGTAGAA 0.77 0.20 3712 0.36 CCTTTTCCAG^GGAGGGAATG 0.88 -0.01 4563 0.33 CCCTCCACAG^GTGGCTCAGA 0.81 0.05 5421 0.33 TTTTTTTAAG^AAATAATTAA 0.75 0.13 3783 0.29 TCCCTCACAG^GCAGGGTCTC 0.64 0.26 3173 0.25 GTCTTGCTAG^GGTCCATTTC 0.52 0.36 4058 0.24 CTCCCTGGAG^GAGCCATGGT 0.43 0.51 1784 0.22 TCACTGTTAG^GAATGTCCCA 0.68 0.08 6512 0.21 CCCTTGCCAG^ACAAGCCCAT 0.67 0.08 2376 0.20 CCCTGTCTAG^GGGGGAGTGC 0.61 0.16 1225 0.18 CCCCTCTCAG^CCCCTGTCCT 0.65 0.07 1743 0.13 TTCTCTGCAG^GGTCAGTCCC 0.62 0.03 3834 0.13 GGGCCTGCAG^TGCTCGTGTG 0.26 0.58 4109 0.13 TGCCCAGCAG^CAGGAGTCAG 0.29 0.54 6557 0.13 CATTCTGGAG^AATCTGCTCC 0.56 0.12 1638 0.11 CCATTCTCAG^GGAATCTCTG 0.62 0.00 247 0.10 GCCTTCGCAG^CATTCTTGGG 0.55 0.11 6766 0.09 CTATCCACAG^GATAGATTGA 0.64 -0.06 906 0.08 AATTTCACAG^CAAGAAAACT 0.61 -0.02 6499 0.08 CAGTTTCCAG^TTTCCCTTGC 0.55 0.06 378 0.07 GTACCCACAG^TACTACCTGG 0.24 0.52 3130 0.07 CTGTCTCCAG^AAAATTCCCA 0.51 0.12 4272 0.07 ACCATCCCAG^CGTTCTTTGC 0.58 0.00 4522 0.07 TGAATCTCAG^GGTGGGCCCA 0.51 0.12 5722 0.07 ACCCTCGCAG^CAGCAGCAAC 0.55 0.05 2316 0.06 CTTCCCCAAG^GCCTCCTCAA 0.40 0.27 2357 0.06 GCCTTCCTAG^CTACCCTCTC 0.39 0.28 2908 0.06 TTTGGTCTAG^TACCCCGGGG 0.51 0.10 4112 0.06 CCAGCAGCAG^GAGTCAGCCA 0.25 0.50 1327 0.05 TTTGCTTTAG^AATAATGTCT 0.52 0.06 844 0.04 GTTTGTGCAG^GGCTGGCACT 0.62 -0.11 1045 0.04 TCCCTTGGAG^CAGCTGTGCT 0.54 0.01 1238 0.03 CTGTCCTCAG^GTGCCCCTCC 0.50 0.06 2976 0.03 CCTAGTGCAG^GTGGCCATAT 0.62 -0.12 3825 0.03 CATCCCCGAG^GGCCTGCAGT 0.16 0.60 1508 0.02 TGAGATGCAG^GAGGAGACGC 0.43 0.16 2257 0.02 CTCTCCTCAG^CGTGTGGTCC 0.53 0.00 5712 0.02 ATCCTCTCAG^ACCCTCGCAG 0.51 0.05 2397 0.00 CCCTCCTTAG^GCAGTGGGGT 0.41 0.16 4800 0.00 CATTTTCTAG^CTGTATGGCC 0.47 0.07 5016 0.00 TGCCTAGCAG^GTTCCCACCA 0.59 -0.11 DONOR SITES: POSITION CONFIDENCE EXON INTRON LOCAL GLOBAL 3979 0.75 CGTCAAGGAG^GTACGGGCCG 0.92 0.74 2608 0.51 GCTGGTCCAG^GTAATGGCAC 0.85 0.54 4335 0.38 GAACAAGCAG^GTGCCTACTG 0.83 0.41 656 0.32 ACCCTGGGCG^GTATGAGCCG 0.56 0.66 5859 0.11 ACCAAAAGAG^GTGTGTGTGT 0.85 0.07 4585 0.09 GCTCACTCAG^GTGGGAGAAG 0.86 0.03 1708 0.06 TGGCCAGAAG^GTGGGTGTGC 0.85 0.01 6196 0.05 CCCAATGAGG^GTGAGATTGG 0.86 -0.01 667 0.03 TATGAGCCGG^GTGTGGGTGG 0.23 0.71 ------------------------------------------------------------------------