From aweigend at stern.nyu.edu Sat Aug 1 09:39:38 1998 From: aweigend at stern.nyu.edu (Andreas Weigend) Date: Sat, 1 Aug 1998 09:39:38 -0400 (EDT) Subject: Computational Finance Jan 6-8 1999 at NYU/Stern: CFP and Registration Message-ID: <199808011339.JAA05053@sabai.stern.nyu.edu> A non-text attachment was scrubbed... Name: not available Type: text Size: 8543 bytes Desc: not available Url : https://mailman.srv.cs.cmu.edu/mailman/private/connectionists/attachments/00000000/a15f186d/attachment.ksh From Tom.Mitchell at cs.cmu.edu Tue Aug 4 18:06:46 1998 From: Tom.Mitchell at cs.cmu.edu (Tom Mitchell) Date: Tue, 04 Aug 1998 18:06:46 -0400 Subject: Proposed Center for Science of Learning Message-ID: A Proposed National Center for the Science of Learning Are you interested in helping create a new national center on the science of learning? Would you like to raise the priority of research in this area at the national level? Would you be interested in visiting such a Center to interact with researchers around the world studying learning? We (a few dozen faculty from Carnegie Mellon and University of Pittsburgh) seek your participation in an international center to accelerate research on all forms of learning. At present we are writing one of 44 final proposals that will compete for approximately 10 NSF Science and Technology Centers. To our knowledge, ours is the only proposed center with an emphasis on learning. The center mission is to work toward a general science of learning that spans multiple paradigms including Automated learning: computer learning algorithms, data mining, robot learning, ... Theoretical foundations: computational learning theory, statistics, information theory ... Biological learning: neurobiology, cognitive models, education, automated tutors,.. The Center, while based in Pittsburgh, intends to involve researchers and educators from around the nation and the world in a variety of ways. Therefore, we seek your advice and your participation. Please take a moment to indicate which of the following you would like to see as Center activities: 1. Sponsor several workshops each year on specific learning issues? Are you likely to propose a workshop / attend / send someone? (circle any that apply) Suggestions (including possible topics): 2. Sponsor visits by undergrads / grad students / postdocs / faculty for a few days / summer / a year to participate in Center courses / research / other ? Are you potentially interested in a sponsored visit / sending someone to visit? Suggestions? 3. Sponsor a community-wide web repository of relevant research papers / commentary / standard data sets / software / online demos / course materials / other (working with current services such as the UCI repository, STATLIB, etc.)? Are you potentially interested in contributing to / using / helping manage the repository? Suggestions? 4. Sponsor a summer school taught by instructors from multiple institutions? Or courses accessible via the web to students worldwide, and taught by instructors worldwide via video network? Are you potentially interested in helping teach a course / taking a course? Suggestions? 5. Support our research community in other ways? Please suggest how: 6. Please classify yourself as: faculty / student / researcher / practitioner / manager / other academia / industry / government / other If you would like to be on the mailing list for Center discussions, please include your email address: You may return responses by email to learn at cs.cmu.edu. 
For more information on the center, including excerpts from our pre-proposal to NSF, see http://www.cs.cmu.edu/~tom/stc.html, or contact one of the co-PI's Stephen.Fienberg at cmu.edu, Kevin Ashley (ashley+pitt.edu), Jay.McClelland at cmu.edu, and Tom.Mitchell at cmu.edu. From aonishi at bpe.es.osaka-u.ac.jp Thu Aug 6 08:33:25 1998 From: aonishi at bpe.es.osaka-u.ac.jp (Toru Aonishi) Date: Thu, 6 Aug 1998 21:33:25 +0900 Subject: Paper available Message-ID: <199808061233.VAA09932@fsunj.bpe.es.osaka-u.ac.jp> Dear Colleagues, The following paper is available in postscript form from ftp://ftp.bpe.es.osaka-u.ac.jp/pub/FukushimaLab/Papers/aonishi/paper_prl.tar.gz Thanks, Toru Aonishi aonishi at bpe.es.osaka-u.ac.jp ----------------- Title: A statistical mechanics of an oscillator associative memory with scattered natural frequencies Authors: Toru Aonishi, Koji Kurata and Masato Okada cond-mat/9808059 Abstract: We analyze an oscillator associative memory with scattered natural frequencies in memory retrieval states by a statistical mechanical method based on the SCSNA and the Sakaguchi-Kuramoto theory. The system with infinite stored patterns has a frustration on the synaptic weights. In addition, it is numerically shown that almost all oscillators synchronize under memory retrieval, but desynchronize under spurious memory retrieval, when setting optimal parameters. Thus, it is possible to determine whether the recalling process is successful or not using information about the synchrony/asynchrony. The solvable toy model presented in this paper may be a good candidate for showing the validity of the synchronized population coding in the brain proposed by Phillips and Singer. From ajain at iconixpharm.com Thu Aug 6 16:53:45 1998 From: ajain at iconixpharm.com (Ajay Jain) Date: 06 Aug 98 13:53:45 -0700 Subject: Positions available: Iconix Pharmaceuticals Message-ID: ***** PLEASE FORWARD TO RELEVANT MAILING LISTS ***** I am seeking one or two individuals for PhD-level scientist positions at Iconix Pharmaceuticals. People with applied neural network experience are particularly of interest (see details below). Iconix is a San Francisco Bay Area biopharmaceutical company that is establishing the new area of Chemical Genomics. The company's approach aims to advance the discovery process for human drugs through the systematic acquisition, integration, and analysis of genetic and chemical information. We are seeking people to develop and implement new algorithms for interpreting large, complex data sets generated from the interaction of many small molecules with numerous gene products. Ideal candidates will have a PhD in computer science and demonstrated success in applying sophisticated computation to real-world problems (e.g. drug discovery, computational biology, object recognition, robotics, etc.). Experience in machine-learning, computational geometry, or physical modeling is beneficial, as is formal training in chemistry, biology, or physics. Experience in applying neural network techniques to problems involving noisy data is particularly relevant. As the member of an interdisciplinary team, you will develop, implement, and apply novel algorithms to the interpretation of chemical and biological information. Candidates for these positions will have excellent communication skills, excellent work history and references, and the ability to work both independently and as part of a multidisciplinary team of scientists, including colleagues from biology, chemistry, genetics, and other fields. 
We offer competitive salaries, stock options, and excellent benefits. Please send or fax your resume to: Human Resources Dept., Iconix Pharmaceuticals, Inc., 850 Maude Avenue, Mountain View, CA 94043. Attn: Chemical Information Sciences Group. FAX (650) 526-3034, EMAIL hr at iconixpharm.com. Iconix Pharmaceuticals, Inc., is an Equal Opportunity Employer. The type of research involved is exemplified by the following papers: J. Ruppert, W. Welch, and A. N. Jain. Automatic Characterization and Identification of Protein Binding Pockets for Molecular Docking. Protein Science 6: 524-533, 1996. A. N. Jain. Scoring Non-Covalent Ligand-Protein Interactions: A Continuous Differentiable Function Tuned to Compute Binding Affinities. Journal of Computer-Aided Molecular Design. 10:5, 427-440, 1996. W. Welch, J. Ruppert, and A. N. Jain. Hammerhead: Fast, Fully Automated Docking of Flexible Ligands to Protein Binding Sites. Chemistry and Biology 3: 449-462, 1996. A. N. Jain, N. L. Harris, and J. Y. Park. Quantitative Binding Site Model Generation: Compass Applied to Multiple Chemotypes Targeting the 5HTlA Receptor. Journal of Medicinal Chemistry 38: 1295-1307, 1995. A. N. Jain, T. G. Dietterich, R. L. Lathrop, D. Chapman, R. E. Critchlow, B. E. Bauer, T. A. Webster, and T. Lozano-Perez. Compass: A Shape-Based Machine Learning Tool for Drug Design. Journal of Computer-Aided Molecular Design 8(6): 635-652, 1994. A. N. Jain, K. Koile, D. Chapman. Compass: Predicting Biological Activities from Molecular Surface Properties; Performance Comparisons on a Steroid Benchmark. Journal of Medicinal Chemistry 37: 2315-2327, 1994. T. G. Dietterich, A. N. Jain, R. L. Lathrop, and T. Lozano-Perez. A Comparison of Dynamic Reposing and Tangent Distance for Drug Activity Prediction. In Advances in Neural information Processing Systems 6, ed. J. D. Cowan, G. Tesauro, and J. Alspector. San Francisco, CA: Morgan Kaufmann. 1994. -------------------------------------------------------------------- Dr. Ajay N. Jain Associate Director, Chemical Information Sciences 850 Maude Ave. Iconix Pharmaceuticals Mountain View, CA 94043 ajain at iconixpharm.com From cindy at cns.bu.edu Thu Aug 6 14:43:12 1998 From: cindy at cns.bu.edu (Cynthia Bradford) Date: Thu, 6 Aug 1998 14:43:12 -0400 Subject: Boston University: Cognitive and Neural Systems 1999 Meeting Message-ID: <199808061843.OAA28727@retina.bu.edu> *****CALL FOR PAPERS***** THIRD INTERNATIONAL CONFERENCE ON COGNITIVE AND NEURAL SYSTEMS May 26-29, 1999 Sponsored by Boston University's Center for Adaptive Systems and Department of Cognitive and Neural Systems with financial support from DARPA and ONR How Does the Brain Control Behavior? How Can Technology Emulate Biological Intelligence? The conference will include invited tutorials and lectures, and contributed lectures and posters by experts on the biology and technology of how the brain and other intelligent systems adapt to a changing world. The conference is aimed at researchers and students of computational neuroscience, connectionist cognitive science, artificial neural networks, neuromorphic engineering, and artificial intelligence. A single oral or poster session enables all presented work to be highly visible. Abstract submissions encourage submissions of the latest results. Costs are kept at a minimum without compromising the quality of meeting handouts and social events. 
CALL FOR ABSTRACTS Session Topics: * vision * spatial mapping and navigation * object recognition * neural circuit models * image understanding * neural system models * audition * mathematics of neural systems * speech and language * robotics * unsupervised learning * hybrid systems (fuzzy, evolutionary, digital) * supervised learning * neuromorphic VLSI * reinforcement and emotion * industrial applications * sensory-motor control * other * cognition, planning, and attention Contributed Abstracts must be received, in English, by January 29, 1999. Notification of acceptance will be given by February 28, 1999. A meeting registration fee of $45 for regular attendees and $30 for students must accompany each Abstract. See Registration Information for details. The fee will be returned if the Abstract is not accepted for presentation and publication in the meeting proceedings. Registration fees of accepted abstracts will be returned on request only until April 15, 1999. Each Abstract should fit on one 8.5" x 11" white page with 1" margins on all sides, single-column format, single-spaced, Times Roman or similar font of 10 points or larger, printed on one side of the page only. Fax submissions will not be accepted. Abstract title, author name(s), affiliation(s), mailing, and email address(es) should begin each Abstract. An accompanying cover letter should include: Full title of Abstract; corresponding author and presenting author name, address, telephone, fax, and email address; and a first and second choice from among the topics above, including whether it is biological (B) or technological (T) work. Example: first choice: vision (T); second choice: neural system models (B). (Talks will be 15 minutes long. Posters will be up for a full day. Overhead, slide, and VCR facilities will be available for talks.) Abstracts which do not meet these requirements or which are submitted with insufficient funds will be returned. Accepted Abstracts will be printed in the conference proceedings volume. No longer paper will be required. The original and 3 copies of each Abstract should be sent to: Cynthia Bradford, Boston University, Department of Cognitive and Neural Systems, 677 Beacon Street, Boston, MA 02215. REGISTRATION INFORMATION: Early registration is recommended. To register, please fill out the registration form below. Student registrations must be accompanied by a letter of verification from a department chairperson or faculty/research advisor. If accompanied by an Abstract or if paying by check, mail to the address above. If paying by credit card, mail as above, or fax to (617) 353-7755, or email to cindy at cns.bu.edu. The registration fee will help to pay for a reception, 6 coffee breaks, and the meeting proceedings. STUDENT FELLOWSHIPS: Fellowships for PhD candidates and postdoctoral fellows are available to cover meeting travel and living costs. The deadline to apply for fellowship support is January 29, 1999. Applicants will be notified by February 28, 1999. Each application should include the applicant's CV, including name; mailing address; email address; current student status; faculty or PhD research advisor's name, address, and email address; relevant courses and other educational data; and a list of research articles. A letter from the listed faculty or PhD advisor on official institutional stationery should accompany the application and summarize how the candidate may benefit from the meeting. Students who also submit an Abstract need to include the registration fee with their Abstract. 
Reimbursement checks will be distributed after the meeting. REGISTRATION FORM Third International Conference on Cognitive and Neural Systems Department of Cognitive and Neural Systems Boston University 677 Beacon Street Boston, Massachusetts 02215 Tutorials: May 26, 1999 Meeting: May 27-29, 1999 FAX: (617) 353-7755 (Please Type or Print) Mr/Ms/Dr/Prof: _____________________________________________________ Name: ______________________________________________________________ Affiliation: _______________________________________________________ Address: ___________________________________________________________ City, State, Postal Code: __________________________________________ Phone and Fax: _____________________________________________________ Email: _____________________________________________________________ The conference registration fee includes the meeting program, reception, two coffee breaks each day, and meeting proceedings. The tutorial registration fee includes tutorial notes and two coffee breaks. CHECK ONE: ( ) $70 Conference plus Tutorial (Regular) ( ) $45 Conference plus Tutorial (Student) ( ) $45 Conference Only (Regular) ( ) $30 Conference Only (Student) ( ) $25 Tutorial Only (Regular) ( ) $15 Tutorial Only (Student) METHOD OF PAYMENT (please fax or mail): [ ] Enclosed is a check made payable to "Boston University". Checks must be made payable in US dollars and issued by a US correspondent bank. Each registrant is responsible for any and all bank charges. [ ] I wish to pay my fees by credit card (MasterCard, Visa, or Discover Card only). Name as it appears on the card: _____________________________________ Type of card: _______________________________________________________ Account number: _____________________________________________________ Expiration date: ____________________________________________________ Signature: __________________________________________________________ From giro at open.brain.riken.go.jp Fri Aug 7 02:27:37 1998 From: giro at open.brain.riken.go.jp (Dr. Mark Girolami) Date: Fri, 07 Aug 1998 15:27:37 +0900 Subject: Two ICA Papers Available Message-ID: <35CA9E59.5929@open.brain.riken.go.jp> Dear Colleagues, The following papers are available in gzipped postscript form at the following website http://www-cis.paisley.ac.uk/scripts/staff.pl//giro-ci0/index.html Many Thanks Best Rgds Mark Girolami -------------------------------------------------------------------- Title: An Alternative Perspective on Adaptive Independent Component Analysis Algorithms Author: Mark Girolami Publication: Neural Computation, Vol.10, No.8, pp 2103-2114, 1998. Abstract: This paper develops an extended independent component analysis algorithm for mixtures of arbitrary sub-Gaussian and super-Gaussian sources. The Gaussian mixture model of Pearson is employed in deriving a closed-form generic score function for strictly sub-Gaussian sources. This is combined with the score function for a uni-modal super-Gaussian density to provide a computationally simple yet powerful algorithm for performing independent component analysis on arbitrary mixtures of non-Gaussian sources. ------------------------------------- Title: A Common Neural Network Model for Unsupervised Exploratory Data Analysis and Independent Component Analysis Authors: Mark Girolami, Andrzej Cichocki, and Shun-Ichi Amari Publication: I.E.E.E Transactions on Neural Networks, To Appear. 
Abstract: This paper presents the derivation of an unsupervised learning algorithm, which enables the identification and visualisation of latent structure within ensembles of high dimensional data. This provides a linear projection of the data onto a lower dimensional subspace to identify the characteristic structure of the observations' independent latent causes. The algorithm is shown to be a very promising tool for unsupervised exploratory data analysis and data visualisation. Experimental results confirm the attractiveness of this technique for exploratory data analysis and an empirical comparison is made with the recently proposed Generative Topographic Mapping (GTM) and standard principal component analysis (PCA). Based on standard probability density models, a generic nonlinearity is developed which allows both: 1) identification and visualisation of dichotomised clusters inherent in the observed data, and 2) separation of sources with arbitrary distributions from mixtures whose dimensionality may be greater than the number of sources. The resulting algorithm is therefore also a generalised neural approach to independent component analysis (ICA), and it is considered to be a promising method for analysis of real world data that will consist of sub- and super-Gaussian components, such as biomedical signals. -- ---------------------------------------------- Dr. Mark Girolami (TM) RIKEN, Brain Science Institute Laboratory for Open Information Systems 2-1 Hirosawa, Wako-shi, Saitama 351-01, Japan Email: giro at open.brain.riken.go.jp Tel: +81 48 467 9666 Tel: +81 48 462 3769 (apartment) Fax: +81 48 467 9694 --------------------------------------------- Currently on Secondment From: Department of Computing and Information Systems University of Paisley High Street, PA1 2BE Scotland, UK Email: giro0ci at paisley.ac.uk Tel: +44 141 848 3963 Fax: +44 141 848 3542 Secretary: Mrs E Campbell Tel: +44 141 848 3966 --------------------------------------------- From sshams at biodiscovery.com Fri Aug 7 19:05:23 1998 From: sshams at biodiscovery.com (Soheil Shams) Date: Fri, 07 Aug 1998 16:05:23 -0700 Subject: Position: Image Processing Scientist Message-ID: <35CB8833.CD397DEA@biodiscovery.com> We are looking for a talented, creative individual with a strong background in image processing, machine vision, neural networks, or pattern recognition. This position involves development and implementation of machine vision and image processing algorithms for a number of ongoing and planned projects in high-throughput genetic expression analysis. The position requires the ability to formulate problem descriptions through interaction with end-user customers. These technical issues must then be transformed into innovative and practical algorithmic solutions. We expect our scientists to have outstanding written/oral communication skills and encourage publications in scientific journals. Requirements: A Ph.D. in Electrical Engineering, Computer Science, Physics, or related field, or equivalent experience is required. Knowledge of biology and genetics is a plus but not necessary. Two years of experience with MatLab and at least 5 years of programming experience are required. Commercial software development experience is a plus. For consideration, please send your résumé along with a cover letter to E-mail: HR at BioDiscovery.com Fax: (310) 966-9346 Snail-mail: BioDiscovery, Inc. 11150 W. Olympic Blvd. Suite 805E Los Angeles, CA 90064 Please include pointers to your work online if applicable. BioDiscovery, Inc.
is an early-stage start-up company dedicated to the development of state-of-the-art bioinformatics tools for molecular biology and genomics research applications. We are a leading gene expression image and data analysis firm with an outstanding client list and a progressive industry stance. We are rapidly growing and are looking for talented individuals with experience and motivations to take on significant responsibility and deliver with minimal supervision. BioDiscovery is an equal opportunity employer and our shop has a friendly, intense atmosphere. We are headquartered in sunny Southern California close to the UCLA campus. From emilosam at ubbg.etf.bg.ac.yu Sat Aug 8 01:23:32 1998 From: emilosam at ubbg.etf.bg.ac.yu (Milosavljevic Milan) Date: Sat, 8 Aug 1998 07:23:32 +0200 Subject: Paper in Electronic Letters Message-ID: <01bdc28c$aedbeae0$LocalHost@milan> Dear Connectionist, The following paper will be published at Electonic Letters, No.16, ( August 1998). Printed copy can be obtain upon request sendin e-mail to l.cander at rl.ac.uk or emilosam at ubbg.etf.bg.ac.yu ---------------------------------------------------------------------- Title: Ionospheric forecasting technique by artificial neural network Authors: Ljijana Cander (1), Milan Milosavljevic(2,3), Srdjan Stankovic(2), Sasa Tomasevic (2) (1) Rutherford Appleton Laboratory, Radio Communications Research Unit Chilton, Didcot, Oxon OX11 0QX, UK (2) Faculty of Electrical Engineering, Belgrade University, Yugoslavia (3) Institute for Applied Mathematics and Electronics, Belgrade, Yugoslavia Abstract: Artificial neural network method is applied to the development of ionospheric forecasting technique for one hour ahead. Comparison between observed and predicted values of the critical frequency of F2 layer, foF2, and the total electron content, TEC, are presented to show the appropriatness of the proposed technique. Best Regards, ------------------------------------------------------- Prof. Milan Milosavljevic Faculty of Electrical Engineering University of Belgrade Bulevar Revolucije 73 11000 Belgrade, Yugoslavia tel: (381 11 ) 324 84 64 fax: (381 11 ) 324 86 81 ----------------------------------------------------- Home address: Narodnih Heroja 20/32 11 000 Belgrade, Yugoslavia -------------------------------------------------------- tel./fax (home): (381 11) 672 616 e-mail: emilosam at ubbg.etf.bg.ac.yu From radford at cs.toronto.edu Sat Aug 8 17:06:47 1998 From: radford at cs.toronto.edu (Radford Neal) Date: Sat, 8 Aug 1998 17:06:47 -0400 Subject: Software for Bayesian modeling Message-ID: <98Aug8.170654edt.345@neuron.ai.toronto.edu> FREE SOFTWARE FOR BAYESIAN MODELING USING MARKOV CHAIN MONTE CARLO A new release of my software for flexible Bayesian modeling is available. This software is meant to support research and education regarding: * Flexible Bayesian models for regression and classification based on neural networks and Gaussian processes, and for probability density estimation using mixtures. Neural net training using early stopping is also supported. * Markov chain Monte Carlo methods, and their applications to Bayesian modeling, including implementations of Metropolis, hybrid Monte Carlo, slice sampling, and tempering methods. The neural network, Gaussian process, and mixture model facilties are updated from the previous version, with some new facilities, and some bugs fixed. New programs are provided for applying a variety of Markov chain methods to distributions that are defined using simple formulas. 
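For readers who have not used such methods, the following is a minimal random-walk Metropolis sketch in Python for a target distribution defined by a simple formula. It is only an illustration of the general idea, not the interface of the package being announced here.

    import numpy as np

    rng = np.random.default_rng(0)

    def log_p(x):
        # unnormalized log density "defined using a simple formula":
        # an equal-weight mixture of two unit-variance Gaussians at -2 and +2
        return np.logaddexp(-0.5 * (x - 2.0) ** 2, -0.5 * (x + 2.0) ** 2)

    def metropolis(n_steps=10000, step=1.0, x0=0.0):
        x, lp = x0, log_p(x0)
        samples = []
        for _ in range(n_steps):
            prop = x + step * rng.normal()              # symmetric random-walk proposal
            lp_prop = log_p(prop)
            if np.log(rng.uniform()) < lp_prop - lp:    # accept with probability min(1, p'/p)
                x, lp = prop, lp_prop
            samples.append(x)
        return np.array(samples)

    draws = metropolis()
    print(draws.mean(), draws.std())   # roughly 0 and about 2.2 for this target

The more elaborate methods listed above (hybrid Monte Carlo, slice sampling, tempering) differ mainly in how moves are proposed, but all of them produce a Markov chain whose equilibrium distribution is the target.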
A BUGS-like notation can be used to define some Bayesian models. Note, however, that I have not attempted to provide comprehensive facilities for Bayesian modeling. The intent is more to demonstrate the Markov chain methods. One facility that may be of particular interest is that Annealed Importance Sampling can be used to find the marginal likelihood for a Bayesian model (though obtaining good results may require some fiddling). The software is written in C for use on Unix systems. It is free for research and educational purposes. You can get it from my web page. More directly, you can go to http://www.cs.utoronto.ca/~radford/fbm.software.html and follow the instruction there. Or you can just browse the documentation there to find out more about the software. There are also links to various papers that you might want to read before trying to use the software. Please let me know if you have any problems with obtaining or installing the software, or if you have any other comments. Radford Neal ---------------------------------------------------------------------------- Radford M. Neal radford at cs.utoronto.ca Dept. of Statistics and Dept. of Computer Science radford at utstat.utoronto.ca University of Toronto http://www.cs.utoronto.ca/~radford ---------------------------------------------------------------------------- From skoenig at cc.gatech.edu Mon Aug 10 14:28:06 1998 From: skoenig at cc.gatech.edu (Sven Koenig) Date: Mon, 10 Aug 1998 14:28:06 -0400 (EDT) Subject: CFP: AAAI Spring Symposium Message-ID: <199808101828.OAA08684@green.cc.gatech.edu> A non-text attachment was scrubbed... Name: not available Type: text Size: 4212 bytes Desc: not available Url : https://mailman.srv.cs.cmu.edu/mailman/private/connectionists/attachments/00000000/85df502b/attachment.ksh From takagi at artemis.ad.kyushu-id.ac.jp Mon Aug 10 07:03:16 1998 From: takagi at artemis.ad.kyushu-id.ac.jp (takagi) Date: Mon, 10 Aug 1998 20:03:16 +0900 Subject: abstract of IIZUKA'98 papers on WEB Message-ID: <199808101103.UAA11835@artemis.ad.kyushu-id.ac.jp> A non-text attachment was scrubbed... Name: not available Type: text Size: 797 bytes Desc: not available Url : https://mailman.srv.cs.cmu.edu/mailman/private/connectionists/attachments/00000000/bf888fca/attachment.ksh From Dave_Touretzky at cs.cmu.edu Tue Aug 11 03:34:27 1998 From: Dave_Touretzky at cs.cmu.edu (Dave_Touretzky@cs.cmu.edu) Date: Tue, 11 Aug 1998 03:34:27 -0400 Subject: Connectionist symbol processing: any progress? Message-ID: <23040.902820867@skinner.boltz.cs.cmu.edu> I'd like to start a debate on the current state of connectionist symbol processing? Is it dead? Or does progress continue? A few years ago I contributed a short article on "Connectionist and symbolic representations" to Michael Arbib's Handbook of Brain Theory and Neural Networks (MIT Press, 1995). In that article I explained concepts such as coarse coding, distributed representations, and RAAMs, and how people had managed to do elementary kinds of symbol processing tasks in this framework. The problem, though, was that we did not have good techniques for dealing with structured information in distributed form, or for doing tasks that require variable binding. While it is possible to do these things with a connectionist network, the result is a complex kludge that, at best, sort of works for small problems, but offers no distinct advantages over a purely symbolic implementation. 
The cases where people had shown interesting generalization behavior in connectionist nets involved simple vector-based representations, without nested structures or variable binding. People had gotten some interesting effects with localist networks, by doing spreading activation and a simple form of constraint satisfaction. A good example is the spreading activation models of word disambiguation developed in the 1980s by Jordan Pollack and Dave Walts, and by Gary Cottrell. But the heuristic reasoning enabled by spreading activation models is extremely limited. This approach does not create new structure on the fly, or deal with structured representations or variable binding. Those localist networks that did attempt to implement variable binding did so in a discrete, symbolic way that did not advance the parallel constraint satisfaction/heuristic reasoning agenda of earlier spreading activation research. So I concluded that connectionist symbol processing had reached a plateau, and further progress would have to await some revolutionary new insight about representations. The last really significant work in the area was, in my opinion, Tony Plate's holographic reduced representations, which offered a glimpse of how structured information might be plausibly manipulated in distributed form. (Tony received an IJCAI-91 best paper award for this work. For some reason, the journal version did not appear until 1995.) But further incremental progress did not seem possible. People still do cognitive modeling using connectionist networks. And there is some good work out there. One of my favorite examples is David Plaut's use of attractor neural networks to model deep and surface dyslexia -- an area pioneered by Geoff Hinton and Tim Shallice. But like most connectionist cognitive models, it relies on a simple feature vector representation. The problems of structured representations and variable binding have remained unsolved. No one is trying to build distributed connectionist reasoning systems any more, like the connectionist production system I built with Geoff Hinton, or Mark Derthick's microKLONE. Today, Michael Arbib is working on the second edition of his handbook, and I've been asked to update my article on connectionist symbol processing. Is it time to write an obituary for a research path that expired because the problems were too hard for the tools available? Or are there important new developments to report? I'd love to hear some good news. -- Dave Touretzky References: Arbib, M. A. (ed) (1995) Handbook of Brain Theory and Neural Networks. Cambridge, MA: MIT Press. Plaut, D. C., McClelland, J. L., Seidenberg, M. S., and Patterson, K. (1996) Understandig normal and impaired word reading: computational principles in quasi-regular domains. Psychological Review, 103(1):56-115. Plate, T. A. (1995) Holographic reduced representations. IEEE Transactions on Neural Networks, 6(3):623. Touretzky, D. S. and Hinton, G. E. (1988) A distributed connectionist production system. Cognitive Science, vol. 12, number 3, pp. 423-466. Touretzky, D. S. (1995) Connectionist and symbolic representations. In M. A. Arbib (ed.), Handbook of Brain Theory and Neural Networks, pp. 243-247. MIT Press. 
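For readers who have not seen them, the core of Plate's holographic reduced representations mentioned in the message above is binding by circular convolution of high-dimensional random vectors, with circular correlation as an approximate inverse. The sketch below (Python/numpy, purely illustrative rather than Plate's own code) binds two role-filler pairs into a single fixed-width vector and then approximately recovers one filler:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1024                                   # dimensionality of all vectors

    def cconv(a, b):
        # circular convolution: the HRR binding operator
        return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

    def ccorr(a, b):
        # circular correlation: approximate unbinding
        return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)))

    def rand_vec():
        return rng.normal(0.0, 1.0 / np.sqrt(n), n)

    agent, patient = rand_vec(), rand_vec()    # role vectors
    john, mary = rand_vec(), rand_vec()        # filler vectors

    # encode the structure (agent=john, patient=mary) as one n-dimensional vector
    trace = cconv(agent, john) + cconv(patient, mary)

    # unbinding with the agent role yields a noisy copy of john; a clean-up
    # memory would compare it against all known fillers
    noisy = ccorr(agent, trace)
    for name, v in [("john", john), ("mary", mary)]:
        cos = np.dot(noisy, v) / (np.linalg.norm(noisy) * np.linalg.norm(v))
        print(name, round(float(cos), 2))      # john scores clearly higher than mary

The point of the representation is that the bound structure occupies the same fixed-width vector space as its components, so structures can be nested, at the cost of noise that grows with the depth and size of the structure.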
From duch at MPA-Garching.MPG.DE Tue Aug 11 12:25:36 1998 From: duch at MPA-Garching.MPG.DE (Wlodek Duch) Date: Tue, 11 Aug 1998 18:25:36 +0200 Subject: EANN'99 first call for papers Message-ID: <35D07080.31DF@mpa-garching.mpg.de> EANN '99 Fifth International Conference on Engineering Applications of Neural Networks Warsaw, Poland 13-15 September 1999 WWW page: http://www.phys.uni.torun.pl/eann99/ First Call for Papers The conference is a forum for presenting the latest results on neural network applications in technical fields. The applications may be in any engineering or technical field, including but not limited to: systems engineering, mechanical engineering, robotics, process engineering, metallurgy, pulp and paper technology, aeronautical engineering, computer science, machine vision, chemistry, chemical engineering, physics, electrical engineering, electronics, civil engineering, geophysical sciences, biomedical systems, and environmental engineering. Summaries of two pages (about 1000 words) should be sent by e-mail to eann99 at phys.uni.torun.pl by 15 February 1999 in plain text, TeX, or LaTeX format. Please mention two to four keywords. Submissions will be reviewed. For information on earlier EANN conferences see the WWW http://www.abo.fi/~abulsari/EANN98.html Notification of acceptance will be sent around 15 March. The final papers will be expected by 15 April. All papers will be up to 6 pages in length. Authors are expected to register by 15 April. Authors are encouraged to send the abstracts to the conference address instead of the organisers of the special tracks. Organising committee R. Baratti, University of Cagliari, Italy L. Bobrowski, Polish Academy of Science, Poland A. Bulsari, Nonlinear Solutions Oy, Finland W. Duch, Nicholas Copernicus University, Poland J. Fernandez de Canete, University of Malaga, Spain A. Ruano, University of Algarve, Portugal D. Tsaptsinos, Kingston University, UK National program committee L. Bobrowski, IBIB Warsaw A. Cichocki, RIKEN, Japan W. Duch, Nicholas Copernicus University T. Kaczorek, Warsaw Polytechnic J. Korbicz, Technical University of Zielona Gora L. Rutkowski, Czestochowa Polytechnic R. Tadeusiewicz, Academy of Mining and Metallurgy, Krakow Z. Waszczyszyn, Krakow Polytechnic International program committee (to be confirmed, extended) S. Cho, Pohang University of Science and Technology, Korea T. Clarkson, King's College, UK S. Draghici, Wayne State University, USA G. Forsgrén, Stora Corporate Research, Sweden I. Grabec, University of Ljubljana, Slovenia A. Iwata, Nagoya Institute of Technology, Japan C. Kuroda, Tokyo Institute of Technology, Japan H. Liljenström, Royal Institute of Technology, Sweden L. Ludwig, University of Tübingen, Germany M. Mohammadian, Monash University, Australia P. Myllykoski, Helsinki University of Technology, Finland A. Owens, DuPont, USA R. Parenti, Ansaldo Ricerche, Italy F. Sandoval, University of Malaga, Spain C. Schizas, University of Cyprus, Cyprus E. Tulunay, Middle East Technical University, Turkey S. Usui, Toyohashi University of Technology, Japan P. Zufiria, Polytechnic University of Madrid, Spain Electronic mail is not absolutely reliable, so if you have not heard from the conference secretariat after sending your abstract, please contact us again. You should receive an abstract number within a couple of days after submission. From roitblat at hawaii.edu Tue Aug 11 13:31:10 1998 From: roitblat at hawaii.edu (Herbert L.
Roitblat) Date: Tue, 11 Aug 1998 07:31:10 -1000 Subject: Connectionist symbol processing: any progress? In-Reply-To: <23040.902820867@skinner.boltz.cs.cmu.edu> Message-ID: I don't have answers, but I might add to the bibliography. Part of the problem with symbol processing approaches is that they are not as truly successful as they might pretend. The syntactic part of symbol processing is easy and reliable, but the semantic part is still a muddle. In English, for example, the words do not really have a systematic meaning as Fodor and others would assert. I have my own ideas on the matter, but here are a couple of references on the topic. Aizawa, K. (1997) Explaining systematicity. Mind and Language, 12, 115-136. Mathews, R. J. (1997) Can connectionists explain systematicity? Mind and Language, 12, 154-177. On Mon, 10 Aug 1998 Dave_Touretzky at cs.cmu.edu wrote: > I'd like to start a debate on the current state of connectionist > symbol processing? Is it dead? Or does progress continue? > .. > > People still do cognitive modeling using connectionist networks. And > there is some good work out there. One of my favorite examples is > David Plaut's use of attractor neural networks to model deep and > surface dyslexia -- an area pioneered by Geoff Hinton and Tim > Shallice. But like most connectionist cognitive models, it relies on > a simple feature vector representation. The problems of structured > representations and variable binding have remained unsolved. No one > is trying to build distributed connectionist reasoning systems any > more, like the connectionist production system I built with Geoff > Hinton, or Mark Derthick's microKLONE. > > Today, Michael Arbib is working on the second edition of his handbook, > and I've been asked to update my article on connectionist symbol > processing. Is it time to write an obituary for a research path that > expired because the problems were too hard for the tools available? > Or are there important new developments to report? > > I'd love to hear some good news. > > -- Dave Touretzky > > > References: > > Arbib, M. A. (ed) (1995) Handbook of Brain Theory and Neural Networks. > Cambridge, MA: MIT Press. > > Plaut, D. C., McClelland, J. L., Seidenberg, M. S., and Patterson, > K. (1996) Understandig normal and impaired word reading: computational > principles in quasi-regular domains. Psychological Review, > 103(1):56-115. > > Plate, T. A. (1995) Holographic reduced representations. IEEE > Transactions on Neural Networks, 6(3):623. > > Touretzky, D. S. and Hinton, G. E. (1988) A distributed connectionist > production system. Cognitive Science, vol. 12, number 3, pp. 423-466. > > Touretzky, D. S. (1995) Connectionist and symbolic representations. > In M. A. Arbib (ed.), Handbook of Brain Theory and Neural Networks, > pp. 243-247. MIT Press. > Herbert Roitblat, Ph.D. Professor of Psychology roitblat at hawaii.edu University of Hawaii (808) 956-6727 (808) 956-4700 fax 2430 Campus Road, Honolulu, HI 96822 USA From jose at tractatus.rutgers.edu Tue Aug 11 16:09:48 1998 From: jose at tractatus.rutgers.edu (Stephen Jose Hanson) Date: Tue, 11 Aug 1998 16:09:48 -0400 Subject: POSTDOC LINE FOR FALL 98 Message-ID: <35D0A50C.36703D4@tractatus.rutgers.edu> The DEPARTMENT OF PSYCHOLOGY at RUTGERS UNIVERSITY-Newark Campus - COGNITIVE NEUROSCIENC POSTDOCTORAL Position A postdoctoral position that can be filled immediately running through Fall98/Spring99 with a possibility of a second year renewal (starting date Flexible). 
Area of specialization: connectionist modeling with applications to recurrent networks, image processing, and cognitive neuroscience (functional imaging). Review of applications will begin immediately, but applications will continue to be accepted until the position is filled. Starting date is flexible in the Summer 98 time frame. Rutgers University is an equal opportunity/affirmative action employer. Qualified women and minority candidates are especially encouraged to apply. Send CV to Professor S. J. Hanson, Chair, Department of Psychology - Post Doc Search, Rutgers University, Newark, NJ 07102. Email enquiries can be made to jose at psychology.rutgers.edu; please include "POSTDOC" in the subject heading. From shastri at ICSI.Berkeley.EDU Tue Aug 11 21:15:57 1998 From: shastri at ICSI.Berkeley.EDU (Lokendra Shastri) Date: Tue, 11 Aug 1998 18:15:57 PDT Subject: Connectionist symbol processing: any progress? Message-ID: <199808120115.SAA16013@lassi.ICSI.Berkeley.EDU> Dear Connectionists, I am happy to report that work on connectionist symbol processing is not only alive and kicking, but also in robust and vigorous health. Are we a hair's breadth away from answering all the hard questions about representation and learning? No. But are we making sufficient progress to warrant optimism? Certainly yes! A debate might be fun, but it seems better to start by pointing out a few other examples of relevant work in the field. This should enable those who have not followed the developments to judge for themselves. Here are some URLs (with many references): http://www.icsi.berkeley.edu/NTL/ http://www.icsi.berkeley.edu/~shastri/shrutibiblio.html http://cs.ua.edu/~rsun http://www.dcs.ex.ac.uk/~jamie/ http://www.psych.ucla.edu/Faculty/Hummel/ and here are a few references for readers with limited web access: Embodied Lexical Development, Proceedings of the Nineteenth Annual Meeting of the Cognitive Science Society COGSCI-97, Aug 9-11, Stanford: Stanford University Press, 1997. Bailey, D., J. Feldman, S. Narayanan, G. Lakoff (1997). Connectionist Syntactic Parsing Using Temporal Variable Binding. J. Henderson. Journal of Psycholinguistic Research, 23(5):353--379, 1994. The Human Semantic Potential: Spatial Language and Constrained Connectionism, T. Regier, Cambridge, MA: MIT Press. 1996. From paolo at uow.edu.au Tue Aug 11 20:58:30 1998 From: paolo at uow.edu.au (Paolo Frasconi) Date: Wed, 12 Aug 1998 10:58:30 +1000 (EST) Subject: Connectionist symbol processing: any progress? In-Reply-To: <23040.902820867@skinner.boltz.cs.cmu.edu> Message-ID: Adaptive techniques for dealing with structured information have recently emerged. In particular, algorithms and architectures for learning directed ordered acyclic graphs (DOAGs) are available and they have been shown to be effective in some application domains such as automated reasoning and pattern recognition. The basic idea is an extension of recurrent neural networks from sequences (which can be seen as a very special case of graphs having a linear-chain shape) to graphs. A generalization of backpropagation through time is available for acyclic graphs. Models for data structures can be conveniently represented in a graphical formalism which makes it simple to understand them as special cases of belief networks. There is still quite a lot of work to be done in this area and the interest is expanding. Last year we had a NIPS workshop on adaptive processing of data structures.
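To make the idea concrete, here is a toy sketch in Python/numpy of a recursive network that applies the same composition weights at every internal node of a binary tree, reducing to an ordinary recurrent network when the tree degenerates to a chain. It is an illustration of the general scheme only, not the authors' architectures; training such a model uses the generalization of backpropagation through time mentioned above, applied over the graph structure.

    import numpy as np

    rng = np.random.default_rng(1)
    d = 16                                    # size of each node representation
    W = rng.normal(0, 0.1, (d, 2 * d))        # shared composition weights
    b = np.zeros(d)

    def encode(node, leaf_vecs):
        # 'node' is either a leaf label (str) or a pair (left, right);
        # the same weights are reused at every internal node, which is what
        # lets the network handle trees of arbitrary shape and size
        if isinstance(node, str):
            return leaf_vecs[node]
        left, right = node
        children = np.concatenate([encode(left, leaf_vecs), encode(right, leaf_vecs)])
        return np.tanh(W @ children + b)

    leaf_vecs = {w: rng.normal(0, 1, d) for w in ["a", "b", "c"]}
    tree = ("a", ("b", "c"))                  # a tiny labelled binary tree
    print(encode(tree, leaf_vecs).shape)      # -> (16,): one fixed-size code per tree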
Further details and links to papers can be found in the web page http://www.dsi.unifi.it/~paolo/datas (also mirrored at http://www.uow.edu.au/~paolo/datas). Paolo Frasconi paolo at uow.edu.au Marco Gori marco at ing.unisi.it Alessandro Sperduti perso at di.unipi.it Paolo Frasconi Visiting Lecturer Faculty of Informatics University of Wollongong Phone: +61 2 4221 3121 (office) Northfields Avenue +61 2 4226 4925 (home) Wollongong NSW 2522 Fax: +61 2 4221 4843 AUSTRALIA http://www.uow.edu.au/~paolo From kmarg at uom.gr Wed Aug 12 04:30:35 1998 From: kmarg at uom.gr (Kostas Margaritis) Date: Wed, 12 Aug 1998 11:30:35 +0300 (EET DST) Subject: Connectionist symbol processing: any progress? In-Reply-To: <23040.902820867@skinner.boltz.cs.cmu.edu> Message-ID: Cognitive Maps (and Fuzzy Cognitive Maps) are another aspect of Graphical Belief Models which can be viewed as Recurrent Neural Networks, leading to concepts relevant to Cognitive or Decision Dynamics. The original graphical cognitive model can be manipulated either as a directed graph (i.e., cycles, centrality, etc.) or as a dynamical system leading to behaviours similar to the dynamics of recurrent neural networks. Kostas Margaritis Dept. of Informatics Univ. of Macedonia Thessaloniki, Greece e-mail kmarg at uom.gr From r.gayler at psych.unimelb.edu.au Wed Aug 12 08:19:20 1998 From: r.gayler at psych.unimelb.edu.au (Ross Gayler) Date: Wed, 12 Aug 1998 22:19:20 +1000 Subject: Connectionist symbol processing: any progress? Message-ID: <3.0.32.19980812221919.006a0be0@myriad.unimelb.edu.au> At 03:34 11/08/98 -0400, you (Dave_Touretzky at cs.cmu.edu) wrote: >I'd like to start a debate on the current state of connectionist >symbol processing? Is it dead? Or does progress continue? > ... >So I concluded that connectionist symbol processing had reached a >plateau, and further progress would have to await some revolutionary >new insight about representations. The last really significant work >in the area was, in my opinion, Tony Plate's holographic reduced >representations, which offered a glimpse of how structured information >might be plausibly manipulated in distributed form. > ... >The problems of structured >representations and variable binding have remained unsolved. No one >is trying to build distributed connectionist reasoning systems any >more, like the connectionist production system I built with Geoff >Hinton, or Mark Derthick's microKLONE. Tony Plate is still alive and kicking. There was a small group of like-minded researchers present at the Analogy'98 workshop in Sofia. They would (and did) argue that methods related to Tony's HRRs do solve the problems of structured representations and variable binding. (The real problem is not how to represent structures but how to use those structures to drive cognitively useful operations.) The fact that these people turned up at an analogy workshop follows from a belief that analogy is the major mode of (connectionist) reasoning.
The Sofia papers most related to HRRs are: Tony Plate Structured operations with distributed vector operations available from Tony's home page: http://www.mcs.vuw.ac.nz/~tap/ Pentti Kanerva Dual role of analogy in the design of a cognitive computer related older papers are available from: http://www.sics.se/nnrc/spc.html Ross Gayler & Roger Wales (note blatant self-promotion) Connections, binding, unification and analogical promiscuity Multiplicative binding, representation operators, and analogy both available from the computer science section of: http://cogprints.soton.ac.uk/ Other (non-HRR) connectionist analogy work from Sofia included: Keith Holyoak & John Hummel Analogy in a physical symbol system Graeme Halford, William Wilson & Steven Phillips Relational processing in higher cognition Julie McCredden NetAB: A neural network model of analogy by discovery Also, Chris Eliasmith (who was not at the Sofia workshop) has done some work re-implementing the ACME model of analogical mapping to be based on HRRs. Check out his unpublications at: http://ascc.artsci.wustl.edu/~celiasmi/ Finally, anyone who wants to know about the Sofia workshop should check: http://www-psych.stanford.edu/~kalina/cogsci98/analogy_workshop.html and copies of the proceedings may be ordered from: analogy at cogs.nbu.acad.bg Cheers, Ross Gayler From marwan at ee.usyd.edu.au Wed Aug 12 08:46:09 1998 From: marwan at ee.usyd.edu.au (Marwan Jabri) Date: Wed, 12 Aug 1998 22:46:09 +1000 (EST) Subject: academic positions at the University of Sydney Message-ID: Lecturer/Senior Lecturer in Computer Engineering (2 positions) School of Electrical & Information Engineering The University of Sydney, Australia Reference No. A30/01 The School of Electrical & Information Engineering is expanding its teaching and research programs in the area of computer engineering, and invites applications for two continuing track positions in that area. Particular areas of interest include: neuromorphic engineering; VLSI systems; VLSI-based architectures and applications; embedded hardware/software co-design; computer hardware and architectures; advanced digital engineering; and parallel and distributed processing. The existing academic staff in related areas have research interests in low power VLSI, neuromorphic engineering, artificial intelligence, real-time systems, biomedical systems, nonlinear and adaptive control, communications systems and image processing. Applicants at the Senior Lecturer level should have a PhD in Electrical or Computer Engineering. Applicants at the Lecturer level should have, or expect soon to receive, a PhD. PhDs in computer science with a substantial engineering background may be considered. The position requires commitment to leadership in research and teaching. Candidates must have a high level of research outcomes and would have the capability to teach undergraduate classes and supervise research students. Undergraduate teaching experience and experience of industrial applications are desirable. Membership of a University approved superannuation scheme is a condition of employment for new appointees. For further information contact Professor M A Jabri, on (+61-2) 9351 2240, fax: (+61-2) 9351 7209, email: , or the Head of Department, Professor D J Hill, on (+61-2) 9351 4647, fax: (+61-2) 9351 3847, email: . Salary: Senior Lecturer $57,610 - $66,429 pa. (increasing to $59,338 - $68,422 pa. on 1/10/98) Lecturer $47,029 - $55,848 pa. (increasing to $48,440 - $57,523 pa.
on 1/10/98) (Level of appointment and responsibility will be commensurate with qualifications and experience) Academic staff at the School of Electrical and Information Engineering are currently entitled to receive a market loading (up to a maximum of 33.33% of base salary). The market loading scheme is expected to continue until the end of 1999 when it will be reviewed. Closing: 29 October 1998 Application Information ----------------------- No smoking in the workplace is University policy. Equal employment opportunity is University policy. Other than in exceptional circumstances all vacancies within the University are advertised in the Bulletin Board and on the World Wide Web. Intending applicants are encouraged to seek further information from the contact given before submitting a formal application. Application Method ------------------ Four copies of the application, quoting reference no., including curriculum vitae, list of publications and the names, addresses and fax numbers of five referees. Applications should be forwarded to: The Personnel Officer (College of Sciences and Technology), Carslaw Building, (F07) The University of Sydney NSW 2006 Australia The University reserves the right not to proceed with any appointment for financial or other reasons. From ruderman at salk.edu Wed Aug 12 13:12:00 1998 From: ruderman at salk.edu (Dan Ruderman) Date: Wed, 12 Aug 1998 10:12:00 -0700 (PDT) Subject: Preprint available on ICA of time-varying natural images Message-ID: The following preprint is available via the web: Independent component analysis of image sequences yields spatiotemporal filters similar to simple cells in primary visual cortex J.H. van Hateren and D.L. Ruderman Proc. R. Soc. Lond. B, in press Abstract Simple cells in primary visual cortex process incoming visual information with receptive fields localized in space and time, bandpass in spatial and temporal frequency, tuned in orientation, and commonly selective for the direction of movement. It is shown that performing independent component analysis on video sequences of natural scenes produces results with qualitatively similar spatiotemporal properties. Whereas the independent components of video resemble moving edges or bars, the independent component filters, i.e. the analogues of receptive fields, resemble moving sinusoids windowed by steady gaussian envelopes. Contrary to earlier ICA results on static images, which gave only filters at the finest possible spatial scale, the spatiotemporal analysis yields filters at a range of spatial and temporal scales. Filters centered at low spatial frequencies are generally tuned to faster movement than those at high spatial frequencies. http://hlab.phys.rug.nl/papers/pvideow.html From tho at james.hut.fi Wed Aug 12 14:23:47 1998 From: tho at james.hut.fi (Timo Honkela) Date: Wed, 12 Aug 1998 21:23:47 +0300 (EEST) Subject: Connectionist symbol processing: any progress? In-Reply-To: Message-ID: On Tue, 11 Aug 1998, Herbert L. Roitblat wrote: > I don't have answers, but I might add to the bibliography. Part of the > problem with symbol processing approaches is that they are not as truly > successful as they might pretend. The syntactic part of symbol processing > is easy and reliable, but the semantic part is still a muddle. In > English, for example, the words do not really have a systematic meaning as > Fodor and others would assert. (...) This point of view is most welcome: the view of symbol processing is often rather limited, e.g., to systematicity and structural issues. 
The area of natural language semantics and pragmatics is vast, and in my opinion, the unsupervised connectionist learning paradigm in particular, e.g., the self-organizing map, provides a basis for modeling and understanding phenomena such as subjectivity of interpretation, intersubjectivity, meaning in context, negotiating meaning, emergence of categories not explicitly given a priori, adaptive prototypes, etc. (please see Timo Honkela: Self-Organizing Maps in Natural Language Processing, Helsinki University of Technology, PhD thesis, 1997; http://www.cis.hut.fi/~tho/thesis/). Adding to the list of references I would like to mention, e.g., MacWhinney, B. (1997). Cognitive approaches to language learning, chapter Lexical Connectionism. MIT Press. (see http://psyscope.psy.cmu.edu/Local/Brian/) Miikkulainen, R. (1997). Self-organizing feature map model of the lexicon. Brain and Language, 59:334-366. (see http://www.cs.utexas.edu/users/risto/) Also Peter Gärdenfors has written several relevant articles in this area (see http://lucs.fil.lu.se/Staff/Peter.Gardenfors/). Best regards, Timo Honkela ------------------------------------------------------------------- Timo Honkela, PhD Timo.Honkela at hut.fi http://www.cis.hut.fi/~tho/ Neural Networks Research Centre, Helsinki University of Technology P.O.Box 2200, FIN-02015 HUT, Finland Tel. +358-9-451 3275, Fax +358-9-451 3277 From jfeldman at ICSI.Berkeley.EDU Wed Aug 12 14:33:46 1998 From: jfeldman at ICSI.Berkeley.EDU (Jerry Feldman) Date: Wed, 12 Aug 1998 11:33:46 -0700 Subject: Connectionist symbol processing: any progress? Message-ID: <35D1E00A.7B1CC6A2@icsi.berkeley.edu> Dave Touretzky asks how well we are doing at making neurally plausible models of human symbolic processes like natural language. Let's start at CMU. One of the driving sources of the "new connectionism" was the interactive activation model of McClelland and Rumelhart; Jay McC continues to work on this as well as other things. John Anderson's spreading activation ACT* models continue to attract (annual?) workshops. More broadly, the various basic ideas of connectionist modeling are playing an important (sometimes dominant) role in several fields that deal with language and symbolic behavior. For example, Elman nets continue to be a standard way to do models in Cognitive Psychology. The text and workbook on "Rethinking Innateness" by Elman et al. is a major force in Developmental Psychology. Spreading activation models underlie all priming work in Cognitive Psychology and Psycholinguistics. In Neuropsychology, Damasio's convergence zones continue to attract serious attention. Paul Smolensky's Harmony Theory (in a simplified form) has become a dominant paradigm in phonology and is beginning to play a large role in discussions of grammar - witness the long invited article in Science this year. Shastri's note lists a number of other relevant efforts. In Artificial Intelligence, Belief Networks have arguably replaced Symbolic Logic as the leading paradigm. The exact relation between Belief Networks and structured connectionist models remains to be worked out and this would be a good topic for discussion on this list. For a good recent example, see the (prize) paper by Srini Narayanan and Dan Jurafsky at CogSci98. It is true that none of this is much like Touretzky's early attempt at a holographic LISP and that there has been essentially no work along these lines for a decade. There are first order computational reasons for this.
These can be (and have been) spelled out technically, but the basic idea is straightforward - PDP (Parallel Distributed Processing) is a contradiction in terms. To the extent that representing a concept involves all of the units in a system, only one concept can be active at a time. Dave Rumelhart says this is stated somewhere in the original PDP books, but I forget where. The same basic point accounts for the demise of the physicists' attempts to model human memory as a spin glass. Distributed representations do occur in the brain and are useful in many tasks, but conceptual representation just isn't one of them. The question of how the connectionist brain efficiently realizes (and learns) symbolic processes like language is one of the great intellectual problems of our time. I hope that people on this list will continue to contribute to its solution. -- Jerry Feldman From gutkin at cnbc.cmu.edu Wed Aug 12 16:58:44 1998 From: gutkin at cnbc.cmu.edu (Boris Gutkin) Date: Wed, 12 Aug 1998 16:58:44 -0400 (EDT) Subject: Paper on ISI variability Message-ID: Dear Colleagues, we wanted to bring to your attention our recently published paper on spike generating dynamics and ISI variability in cortical neurons, which is available in postscript from cnbc.cmu.edu in pub/user/gutkin; the file is type1noise.ps Cheers, Boris Gutkin _______________________________ Neural Computation 10(5) Dynamics of membrane excitability determine inter-spike interval variability: a link between spike generation mechanisms and cortical spike train statistics. Boris S. Gutkin and G. Bard Ermentrout Program in Neurobiology and Dept. of Mathematics University of Pittsburgh Pittsburgh PA We propose a biophysical mechanism for the high interspike interval variability observed in cortical spike trains. The key lies in the non-linear dynamics of cortical spike generation, which are consistent with type I membranes where saddle-node dynamics underlie excitability [Rinzel '89]. We present a canonical model for type I membranes, the $\theta$-neuron. The $\theta$-neuron is a phase model whose dynamics reflect salient features of type I membranes. This model generates highly variable spike trains (coefficient of variation (cv) above 0.6) when brought to firing by noisy inputs. This happens because the timing of spikes for a type I excitable cell is exquisitely sensitive to the amplitude of the supra-threshold stimulus pulses. A noisy input current, giving random amplitude "kicks" to the cell, evokes highly irregular firing across a wide range of firing rates. On the other hand, an intrinsically oscillating cell gives regular spike trains. We corroborate the above results with simulations of the Morris-Lecar (M-L) neural model with random synaptic inputs. Type I M-L yields high cv's. When this model is modified to have type II dynamics (periodicity arises via a Hopf bifurcation), it gives regular spike trains (cv below 0.3). Our results suggest that the high cv values such as those observed in cortical spike trains are an intrinsic characteristic of type I membranes driven to firing by "random" inputs. In contrast, neural oscillators or neurons exhibiting type II excitability should produce regular spike trains. From brianh at shn.net Wed Aug 12 20:31:22 1998 From: brianh at shn.net (Brian Hazlehurst) Date: Wed, 12 Aug 1998 17:31:22 -0700 Subject: Connectionist symbol processing: any progress?
In-Reply-To: <35D1E00A.7B1CC6A2@icsi.berkeley.edu> (message from Jerry Feldman on Wed, 12 Aug 1998 11:33:46 -0700) Message-ID: Dave Touretsky's question about the state of research in connectionist symbol processing has elicited interesting responses showing that some interpret the problem as computational, some see the problem as biological, while others see the problem as more narrowly neurological. To this mix, let me add a view that the problem of symbols is (also) social and historical. If this idea interests you, I invite you to read a recent publication: The Emergence of Propositions from the Co-ordination of Talk and Action in a Shared World Brian Hazlehurst brianh at shn.net Edwin Hutchins hutchins at cogsci.ucsd.edu Abstract: We present a connectionist model that demonstrates how propositional structure can emerge from the interactions among members of a community of simple cognitive agents. We first describe a process in which agents coordinating their actions and verbal productions with each other in a shared world leads to the development of propositional structures. We then present a simulation model which implements the process for generating propositions from scratch. We report and discuss the behavior of the model in terms of its ability to produce three properties of propositions: (1) a coherent lexicon characterized by shared form-meaning mappings; (2) conventional structure in the sequence of forms; (3) the predication of spatial facts. We show that these properties do not emerge when a single individual learns the task alone and conclude that the properties emerge from the demands of the communication task rather than from anything inside the individual agents. We then show that the shared structural principles can be described as a grammar, and discuss the implications of this demonstration for theories concerning the origins of the structure of language. In: K. Plunket (Ed.), Language Acquisition and Connectionism. Special Issue of *Language and Cognitive Processes*, 1998, 13 (2/3), 373-424. ///////////////////////////////////////////////////// Brian Hazlehurst Chief Scientist Sapient Health Network brianh at shn.net www.shn.net ///////////////////////////////////////////////////// From goldfarb at unb.ca Wed Aug 12 20:52:09 1998 From: goldfarb at unb.ca (Lev Goldfarb) Date: Wed, 12 Aug 1998 21:52:09 -0300 (ADT) Subject: Connectionist symbol processing: any progress? In-Reply-To: <23040.902820867@skinner.boltz.cs.cmu.edu> Message-ID: On Tue, 11 Aug 1998 Dave_Touretzky at cs.cmu.edu wrote: > I'd like to start a debate on the current state of connectionist > symbol processing? Is it dead? Or does progress continue? > ................................................................. > So I concluded that connectionist symbol processing had reached a > plateau, and further progress would have to await some revolutionary > new insight about representations. > .................................................................... > Today, Michael Arbib is working on the second edition of his handbook, > and I've been asked to update my article on connectionist symbol > processing. Is it time to write an obituary for a research path that > expired because the problems were too hard for the tools available? > Or are there important new developments to report? > > I'd love to hear some good news. David, I'm afraid, I haven't got the "good news", but, who knows, some good may still come out of it. 
About 8-9 years ago, soon after the birth of the connectionists mailing list, there was a discussion somewhat related to the present one. I recall stating, in essence, that it doesn't make sense to talk about the connectionist symbol processing simply because the connectionist representation space--the vector space over the reals--by its very definition (recall the several axioms that define it) doesn't allow one to "see" practically any symbolic operations, and therefore one cannot construct, or learn, in it (without cheating) the corresponding inductive class representation. I have been reluctant to put a substantial effort into a formal proof of this statement since I believe (after so many years of working with the symbolic data) that it is, in some sense, quite obvious (see also [1-3]). Let me try, again, to clarify the above. Hacking apart, the INPUT SPACE of a learning machine must be defined axiomatically, as is the now universal practice in mathematics. These axioms define the BASIC OPERATIONAL BIAS of the learning machine, i.e. the bias related to the class of permitted object operations (compare with the central CS concept of abstract data type). There could be, of course, other, additional, biases related to different classes of learning algorithms each operating, however, in the SAME input space (compare, for example, with the Chomsky overall framework for languages and its various subclasses of languages). It appears that the present predicament is directly related to the fact that, historically, in mathematics, there was, essentially, no work done on the formalization of the concept of "symbolic" representation space. Apparently, such spaces are nontrivial generalizations of the classical representation spaces, the latter being used in all sciences and have evolved from the "numeric" spaces. I emphasize "in mathematics" since logic (including computability theory) does not deal with the representation spaces, where the "representation space" could be thought of as a generalization of the concept of MEASUREMENT SPACE. By the way, "measurement" implies the presence of some distance measure(s) defined on the corresponding space, and that is the reason why the study of such spaces belongs to the domain of mathematics rather than logic. It appears to us now that there are fundamental difference between the two classes of "measurement spaces": the "symbolic" and the "numeric" spaces (see my home page). To give you at least some idea about the differences, I am presenting below the "symbolic solution" (without the learning algorithm) to the generalized parity problem, the problem quite notorious within the connectionist community. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ THE PARITY CLASS PROBLEM The alphabet: A = {a, b} ------------ Input set S (i.e. the input space without the distance function): The set ----------- of strings over A. The parity class C: The set of strings with an even number of b's. ------------------ Example of a positive training set C+: aababbbaabbaa ------------------------------------- baabaaaababa abbaaaaaaaaaaaaaaa bbabbbbaaaaabab aaa Solution to the parity problem, i.e. inductive (parity) class representation: ----------------------------------------------------------------------------- One element from C+, e.g. 
'aaa', plus the following 3 weighted operations operations (note that the sum of the weights is 1) deletion/insertion of 'a' (weight 0) deletion/insertion of 'b' (weight 1) deletion/insertion of 'bb' (weight 0) This means, in particular, that the DISTANCE FUNCTION D between any two strings from the input set S is now defined as the shortest weighted path (based on the above set of operations) between these strings. The class is now defined as the set of all strings in the measurement space (S,D) whose distance from aaa is 0. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Why do, then, so many people work on the "connectionist symbol processing"? On the one hand, many of us feel (correctly, in my opinion) that the symbolic representation is a very important topic. On the other hand, and I am quite sure of that, if we look CAREFULLY at any corresponding concrete implementation, we would see that in order to "learn" the chosen symbolic class one had to smuggle into the model, in some form, some additional structure "equivalent" to the sought symbolic structure (e.g. in the form of the recurrent ANN's architecture). This is, again, due to the fact that in the vector space one simply cannot detect (in a formally reasonable manner) any non-vector-space operations. [1] L. Goldfarb, J. Abela, V.C. Bhavsar, V.N. Kamat, Can a vector space based learning model discover inductive class generalization in a symbolic environment? Pattern Recognition Letters, 16 (7), 1995, pp. 719-726. [2] L. Goldfarb and J. Hook, Why classical models for pattern recognition are not pattern recognition models, to appear in Proc. Intern. Conf. on Advances in Pattern Recognition (ICAPR), ed. Sameer Singh, Plymouth, UK, 23-25 Nov. 1998, Springer. [3] V.C. Bhavsar, A.A. Ghorbany, L. Goldfarb, Artificial neural networks are not learning machines, Tech. Report, Faculty of Computer Science, U.N.B. --Lev Goldfarb http://wwwos2.cs.unb.ca/profs/goldfarb/goldfarb.htm From Lakhmi.Jain at unisa.edu.au Wed Aug 12 21:55:23 1998 From: Lakhmi.Jain at unisa.edu.au (Lakhmi Jain) Date: Thu, 13 Aug 1998 11:25:23 +0930 Subject: Please consider to circulate widely Message-ID: <5DA42F8A3C18D111979400AA00DD609D012F62A7@EXSTAFF3.Levels.UniSA.Edu.Au> Please consider to circulate widely INTERNATIONAL SERIES ON COMPUTATIONAL INTELLIGENCE CRC Press, USA Series Editor-in-chief: L.C. Jain Proposals from authors and editors for future volumes are welcome. Knowledge-Based computational intelligence techniques involve the use of computers to enable machines to simulate human performance. The prominent paradigms used include expert systems, artificial neural networks, fuzzy logic, evolutionary computing techniques and chaos engineering. These knowledge-based computational intelligence techniques have generated tremendous interest among scientists and application engineers due to a number of benefits such as generalization, adaptation, fault tolerance and self-repair, self-organization and evolution. Successful demonstration of the applications of knowledge-based systems theories will aid scientists and engineers in finding sophisticated and low cost solutions to difficult problems. This new series is unique in that it presents novel designs and applications of knowledge-based computational intelligence techniques in engineering, science and related fields. L.C. 
Jain, BE(Hons), ME, PhD, Fellow IE(Aust) Founding Director Knowledge-Based Intelligent Engineering Systems Centre University of South Australia Adelaide Mawson Lakes, SA 5095 Australia L.Jain at unisa.edu.au Please send your proposal including Title of the book General description: what need does this book fill? List 5 key features of the book Table of contents Preface Primary and secondary markets: Who will buy it? Competing and/or related books and publishers: What are the advantages of your book over the competition? A brief CV or biography, with relevant publications Expected manuscript delivery date From makowski at neovista.com Thu Aug 13 18:00:05 1998 From: makowski at neovista.com (Greg Makowski) Date: Thu, 13 Aug 1998 15:00:05 -0700 Subject: 4+ immediate job openings for Knowledge Discovery Engineers Message-ID: <35D361E5.32CC4C8B@neovista.com> 4+ immediate job openings Knowledge Discovery Engineer (KDE) for NeoVista Software Job Description In this position, candidates will be responsible for the implementation and technical success of business solutions through the application of NeoVista's Decision Series or vertical software products. NeoVista puts the business value of customer solutions first. Candidates who will be strongly preferred are those with business deployment experience in at least one, and preferably more, of the following analytic techniques: neural networks, decision trees (e.g. C5.0, CART or CHAID), naive Bayes, clustering or association rules. Strong experience solving business problems with the modeling process, regression, statistics or other analytic numeric methods is also relevant for this position. KDE's should be proficient in programming and manipulating large data systems as part of the analysis. Analytic projects may involve 10-50 GB of data, and may be contained in a half dozen tables to as many as 75 tables. Strong experience in either SQL or SAS is a core skill set. Experience in Unix scripting, Java, C++, or C can be helpful. Must have BA/BS or equivalent with 5+ years of experience in the computer industry, although most of our KDE's have an MS or Ph.D. One or more years of experience with large scale data mining/knowledge discovery/pattern recognition projects as an implementor is desirable. Helpful experience includes analytic or business experience in our vertical markets. Our vertical markets include retail (supply chain optimization problems), insurance, banking or telco (Customer Relationship Marketing problems such as customer retention, segmentation or fraud). Experience in targeted marketing analytics or deployment is useful. Candidates are not required to have such vertical market experience. Other helpful experience includes: presentation, written and client communication skills, project management or proposal writing. Periodic travel may be required for project presentations, or for on-site analytic work. This job description describes a "center" of the set of skills useful for this position. It may be to the advantage of candidates to send a cover letter detailing specific experiences, strengths and weaknesses relevant to a KDE role. Preferred locations for employment are: Cupertino, CA, Dallas, Atlanta or other major metropolitan areas. NeoVista Software has an immediate need to hire multiple KDE's. Pre-Sales Knowledge Discovery Engineer (KDE) for NeoVista Software Job Description People in this position work with a sales executive to listen to prospects, understand their business problems, and to propose a compelling KDD solution.
Giving presentations and or demos to small groups of business VP's and their technical advisors is a common activity. It is important to be able to communicate effectively to business executives and individuals that may be technical in either IT or analytic methods. Travel for 1-3 day trips twice a month may not be uncommon. Once the pre-sales KDE proposes a solution, they may write for a proposal the project plan, description of major tasks, project deliverables and possible business deployment strategies. They may also play an account management role with regard to technical issues. The pre-sales KDE should be able to meet the majority of the requirements for a KDE, however the distinction between KDE's and pre-sales KDE's occurs in degrees. Preferred locations for employment are: Cupertino, CA, Dallas, Atlanta or other major metropolitan areas. NeoVista Software has immediate openings for multiple pre-sales KDE's. Application Directions: Please email or fax a resume and a cover letter addressed to me. I will be travelling the week of 8/17-21/98, so during this time please send both an email and fax. Greg Makowski (pre-sales KDE) NeoVista Software, Inc. 10710 North Tantau Avenue Cupertino, CA 95014 makowski at neovista.com (Word or text resume) (408) 777-2930 fax (408) 343-4239 phone www.neovista.com Thank you for your time, Greg -------------- next part -------------- A non-text attachment was scrubbed... Name: vcard.vcf Type: text/x-vcard Size: 427 bytes Desc: Card for Greg Makowski Url : https://mailman.srv.cs.cmu.edu/mailman/private/connectionists/attachments/00000000/c87d5416/vcard.vcf From rickert at cs.niu.edu Thu Aug 13 13:56:18 1998 From: rickert at cs.niu.edu (Neil W Rickert) Date: Thu, 13 Aug 1998 14:56:18 -0300 Subject: Connectionist symbol processing: any progress? Message-ID: <12011.903026162@ux.cs.niu.Edu> On Wed, 12 Aug 1998 Lev Goldfarb wrote: > Hacking apart, the INPUT SPACE of >a learning machine must be defined axiomatically, as is the now universal >practice in mathematics. These axioms define the BASIC OPERATIONAL BIAS of >the learning machine, i.e. the bias related to the class of permitted >object operations (compare with the central CS concept of abstract data >type). Why? This seems completely wrong to me. It seems to me that what you have stated can be paraphrased as: The knowledge that the learning machine is to acquire must be pre-encoded into the machine as (implicit or explicit) innate structure. But, if my paraphrase is correct, then one wonders why such a machine warrants the name "learning machine." Presumably we have very different ideas as to what is learning. I take it that human learning is the process of discovering the nature of our world. More generally I take it that, at its most fundamental level, learning is the process of discovering the structure of the input space, a process which may require the eventual production of an axiomatization (a set of natural laws) for that space. From goldfarb at unb.ca Thu Aug 13 15:09:27 1998 From: goldfarb at unb.ca (Lev Goldfarb) Date: Thu, 13 Aug 1998 16:09:27 -0300 (ADT) Subject: Connectionist symbol processing: any progress? In-Reply-To: <12011.903026162@ux.cs.niu.Edu> Message-ID: On Thu, 13 Aug 1998, Neil W Rickert wrote: > On Wed, 12 Aug 1998 Lev Goldfarb wrote: > > > Hacking apart, the INPUT SPACE of > >a learning machine must be defined axiomatically, as is the now universal > >practice in mathematics. These axioms define the BASIC OPERATIONAL BIAS of > >the learning machine, i.e. 
the bias related to the class of permitted > >object operations (compare with the central CS concept of abstract data > >type). > > Why? This seems completely wrong to me. > > It seems to me that what you have stated can be paraphrased as: > > The knowledge that the learning machine is to acquire must be > pre-encoded into the machine as (implicit or explicit) innate > structure. Neil, Your paraphrase is wrong: my statement refers to the fact that the fundamental bias related to the (evolutionary) structure of the environment must be postulated. [Again, compare with the concept of abstract data type (ADT), which has been found to be ABSOLUTELY indispensable in computer science for dealing with various type of data]. > But, if my paraphrase is correct, then one wonders why such a machine > warrants the name "learning machine." > > Presumably we have very different ideas as to what is learning. I > take it that human learning is the process of discovering the nature > of our world. More generally I take it that, at its most fundamental > level, learning is the process of discovering the structure of the > input space, a process which may require the eventual production of > an axiomatization (a set of natural laws) for that space. I am suggesting "that, at its most fundamental level, learning is the process of discovering" the inductive class structure and not "the structure of the input space". The latter would be simply impossible to discover unless one is equipped with exactly the same bias. If the machine is not equipped with the "right" evolutionary bias (about the COMPOSITIONAL STRUCTURE OF OBJECTS in the universe) hardly any reliable inductive learning is possible. On the other hand, if it is equipped with the right bias (about the overall structure of objects) then the inductive learning could proceed in the biologically familiar manner. I believe that the "tragedy" of the connectionist symbol processing is directly related to the fact that the vector space bias is structurally too simple/restrictive and has hardly anything to do with the symbolic bias, the latter being a simple example of the "right" structural evolutionary bias of the universe. Cheers, Lev From ruppin at math.tau.ac.il Thu Aug 13 15:10:49 1998 From: ruppin at math.tau.ac.il (Eytan Ruppin) Date: Thu, 13 Aug 1998 22:10:49 +0300 (GMT+0300) Subject: Connectionist symbol processing: any progress? Message-ID: <199808131910.WAA02170@gemini.math.tau.ac.il> Jerry Feldman writes: > - PDP (Parallel Distributed Processing) is a contradiction in terms. > To the extent that representing a concept involves all of > the units in a system, only one concept can be active at > a time. Dave Rumelhart says this is stated somewhere in > the original PDP books, but I forget where. The same > basic point accounts for the demise of the physicists' > attempts to model human memory as a spin glass. There is really no ``contradiction in terms'' here. Indeed, associative memory networks (or attractor neural networks) can activate only one stored concept at a time. However, such networks should not be viewed as representing the whole brain but should be (and indeed are) viewed as representing modular cortical structures such as columns. Given this interpretation, the problem is resolved; If these networks are sufficiently loosely coupled then many patterns can be activated together, resulting in complex and rich dynamics. We should be careful before discarding our models on false grounds. 
We have too few viable models that can serve as paradigms of information processing. Best wishes, Eytan Ruppin. From mitsu at ministryofthought.com Thu Aug 13 16:11:14 1998 From: mitsu at ministryofthought.com (Mitsu Hadeishi) Date: Thu, 13 Aug 1998 13:11:14 -0700 Subject: Connectionist symbol processing: any progress? References: Message-ID: <35D34861.CCE67E25@ministryofthought.com> I'm not quite sure I agree with your analysis. Since I haven't looked at it in great detail, I present this as a tentative critique of your presentation. Since recurrent ANNs can be made to carry out any Turing operation (i.e., modulo their finite size, they are Turing equivalent), then unless you are saying that your represention cannot be implemented on a Turing machine, then it is clearly NOT the case that recurrent ANNs *cannot* learn arbitrary symbolic representations. Not having looked at your scheme in detail, of course, I don't know whether your scheme somehow is unimplementable even on a Turing machine, but it seems to me you must not be claiming this. You seem to be basing your argument on the notion that the input space to a recurrent ANN is a set of numbers, which you interpret as the coordinates of a vector. However, this is only a kind of vague analogy, since the field operations of the vector space (addition, multiplication, etc.) have no clear meaning on the input space. "Adding" two input vectors does not necessarily result in anything meaningful except in the sense that the recurrent ANN to be useful must be locally stable with respect to small variations in the input. However, the actual structure or metric of the input space is in some sense determined not a priori but by the state of the recurrent ANN itself, and can change over time both as a result of training and as a result of iteration. The input space is numbers, yes, but that doesn't make it a vector space. For example, what properties of the input would be preserved if I, say, added the vector (10^25, 10^25, ..) to the input? If it is a "vector space" then that operation would yield something sensible, some symmetries, and yet it obviously does not. Thus, while I sympathize with your claim that the vector field of R(n) does not admit to the structure necessary to make visible much symbolic structure, this in itself does not doom connectionist symbol processing by any means. Your argument does have weight when applied to a single-layer perceptron, which is, after all, just a thresholded/distorted linear transformation. Although it seemed to take the early connectionist community by surprise, it should be no surprise at all that a single-layer perceptron cannot learn the parity problem, because obviously the parity problem is not linearly separable, and how could any linear discriminator possibly learn a non-linearly-separable problem? However, we do not live in a world of single-layer perceptrons. Because networks are more complex than this, arguments about the linearity of the input space seem to me rather irrelevant. I suspect you mean something else, however. I think the intuitive point you are perhaps trying to make is that symbolic representations are arbitrarily nestable (recursively recombinable), and an input space which consists of a fixed number of dimensions cannot handle recursive combinations. However, one can use time-sequence to get around this problem (as we all are doing when we read and write for example). 
Rather than make our eyes, for example, capable of handling arbitrarily recombinable input all at once, we sequence the input to our eyes by reading material over time. The same trick can be used with recurrent networks for example. Mitsu Lev Goldfarb wrote: > David, > I'm afraid, I haven't got the "good news", but, who knows, some good may > still come out of it. > > About 8-9 years ago, soon after the birth of the connectionists mailing > list, there was a discussion somewhat related to the present one. I recall > stating, in essence, that it doesn't make sense to talk about the > connectionist symbol processing simply because the connectionist > representation space--the vector space over the reals--by its very > definition (recall the several axioms that define it) doesn't allow one to > "see" practically any symbolic operations, and therefore one cannot > construct, or learn, in it (without cheating) the corresponding inductive > class representation. I have been reluctant to put a substantial effort > into a formal proof of this statement since I believe (after so many years > of working with the symbolic data) that it is, in some sense, quite > obvious (see also [1-3]). > > Let me try, again, to clarify the above. Hacking apart, the INPUT SPACE of > a learning machine must be defined axiomatically, as is the now universal > practice in mathematics. These axioms define the BASIC OPERATIONAL BIAS of > the learning machine, i.e. the bias related to the class of permitted > object operations (compare with the central CS concept of abstract data > type). There could be, of course, other, additional, biases related to > different classes of learning algorithms each operating, however, in the > SAME input space (compare, for example, with the Chomsky overall framework > for languages and its various subclasses of languages). > > It appears that the present predicament is directly related to the fact > that, historically, in mathematics, there was, essentially, no work done > on the formalization of the concept of "symbolic" representation space. > Apparently, such spaces are nontrivial generalizations of the classical > representation spaces, the latter being used in all sciences and have > evolved from the "numeric" spaces. I emphasize "in mathematics" since > logic (including computability theory) does not deal with the > representation spaces, where the "representation space" could be thought > of as a generalization of the concept of MEASUREMENT SPACE. By the way, > "measurement" implies the presence of some distance measure(s) defined on > the corresponding space, and that is the reason why the study of such > spaces belongs to the domain of mathematics rather than logic. > > It appears to us now that there are fundamental difference between the two > classes of "measurement spaces": the "symbolic" and the "numeric" spaces > (see my home page). To give you at least some idea about the differences, > I am presenting below the "symbolic solution" (without the learning > algorithm) to the generalized parity problem, the problem quite notorious > within the connectionist community. > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > THE PARITY CLASS PROBLEM > > The alphabet: A = {a, b} > ------------ > > Input set S (i.e. the input space without the distance function): The set > ----------- > of strings over A. > > The parity class C: The set of strings with an even number of b's. 
> ------------------ > > Example of a positive training set C+: aababbbaabbaa > ------------------------------------- baabaaaababa > abbaaaaaaaaaaaaaaa > bbabbbbaaaaabab > aaa > > Solution to the parity problem, i.e. inductive (parity) class representation: > ----------------------------------------------------------------------------- > > One element from C+, e.g. 'aaa', plus the following 3 weighted operations > operations (note that the sum of the weights is 1) > deletion/insertion of 'a' (weight 0) > deletion/insertion of 'b' (weight 1) > deletion/insertion of 'bb' (weight 0) > > This means, in particular, that the DISTANCE FUNCTION D between any two > strings from the input set S is now defined as the shortest weighted path > (based on the above set of operations) between these strings. The class is > now defined as the set of all strings in the measurement space (S,D) whose > distance from aaa is 0. > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Why do, then, so many people work on the "connectionist symbol > processing"? On the one hand, many of us feel (correctly, in my opinion) > that the symbolic representation is a very important topic. On the other > hand, and I am quite sure of that, if we look CAREFULLY at any > corresponding concrete implementation, we would see that in order to > "learn" the chosen symbolic class one had to smuggle into the model, in > some form, some additional structure "equivalent" to the sought symbolic > structure (e.g. in the form of the recurrent ANN's architecture). This is, > again, due to the fact that in the vector space one simply cannot detect > (in a formally reasonable manner) any non-vector-space operations. > > > [1] L. Goldfarb, J. Abela, V.C. Bhavsar, V.N. Kamat, Can a vector space > based learning model discover inductive class generalization in a > symbolic environment? Pattern Recognition Letters, 16 (7), 1995, pp. > 719-726. > > [2] L. Goldfarb and J. Hook, Why classical models for pattern recognition > are not pattern recognition models, to appear in Proc. Intern. Conf. > on Advances in Pattern Recognition (ICAPR), ed. Sameer Singh, > Plymouth, UK, 23-25 Nov. 1998, Springer. > > [3] V.C. Bhavsar, A.A. Ghorbany, L. Goldfarb, Artificial neural networks > are not learning machines, Tech. Report, Faculty of Computer Science, > U.N.B. > > > --Lev Goldfarb > > http://wwwos2.cs.unb.ca/profs/goldfarb/goldfarb.htm From bryan at cog-tech.com Thu Aug 13 16:57:14 1998 From: bryan at cog-tech.com (Bryan B. Thompson) Date: Thu, 13 Aug 1998 16:57:14 -0400 Subject: Connectionist symbol processing: any progress? In-Reply-To: <23040.902820867@skinner.boltz.cs.cmu.edu> (Dave_Touretzky@cs.cmu.edu) Message-ID: <199808132057.QAA14403@cti2.cog-tech.com> Hello, We are currently engaged in cognitive modeling of reflexive (recognitional) and reflective (metacognitive) behaviors. To this end, we have used a structured connectionist model of inferential long-term memory, with good results. The reflexive system is based on the Shruti model proposed by Shastri and Ajjanagadde (1993, 1996). Working with Shastri, we have extended the model to incorporate supervised learning, priming, etc. We are currently working on an integration of belief and utility within this model. The resulting network will be able to not only reflexively construct interpretations of evidence from its environment, but will be able to reflexively plan and execute responses at multiple levels of abstraction as well. 
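As a rough, self-contained illustration of the temporal-synchrony variable binding that Shruti-style models rely on (a minimal sketch only, not the Cognitive Technologies or Shastri implementation; the entities, roles, and phase-slot scheme are invented for the example), bindings can be read off from which nodes fire in the same phase slot of an oscillation cycle:

```python
# Minimal sketch of temporal-synchrony variable binding (Shruti-style).
# Assumptions (not from the posting above): time is divided into cycles of
# N_PHASES slots; each active entity "owns" one slot, and a role node signals
# its binding by firing in the slot of the entity it is bound to.

N_PHASES = 4                                        # phase slots per cycle (illustrative)
entity_phase = {"John": 0, "Mary": 1, "book": 2}    # each entity gets a distinct phase
assert len(entity_phase) <= N_PHASES                # distinct phases must fit in one cycle

# One instance of give(giver, recipient, object): each argument node adopts
# the phase of its filler, so the binding is carried by timing, not by wiring.
give_bindings = {"giver": "John", "recipient": "Mary", "object": "book"}
role_phase = {role: entity_phase[filler] for role, filler in give_bindings.items()}

def bound_entity(role):
    """Recover a role's filler by phase coincidence with the entity nodes."""
    return [e for e, p in entity_phase.items() if p == role_phase[role]]

def spike_times(phase, n_cycles=3):
    """(cycle, slot) pairs at which a node assigned to `phase` fires."""
    return [(cycle, phase) for cycle in range(n_cycles)]

if __name__ == "__main__":
    for role in give_bindings:
        print(role, "->", bound_entity(role), "fires at", spike_times(role_phase[role]))
```

Because a binding is carried by when a node fires rather than by a dedicated binder unit, several role-filler pairs can be active within the same cycle as long as their fillers occupy distinct phase slots.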
The reflexive system is coupled to a metacognitive system, which is responsible for directing the focus of attention, making and testing assumptions, identifying and responding to conflicting interpretations and/or goals, locating unreliable conclusions, and managing risk. The perspective that only a fully distributed representation represents a connectionist solution to structure, variable binding, etc. is, perhaps, what warrants challenge. The brain is by no measure without internal structure on both gross and very detailed levels. While research has yet to identify rich mechanisms for dealing with structured representations and inference within a fully distributed representation, it has also yet to fully explore the potential of specialized neural structures for systematic reasoning. -- bryan thompson Cognitive Technologies, Inc. bryan at cog-tech.com References: Thompson, B.B., Cohen, M.S., Freeman, J.T. (1995). Metacognitive behavior in adaptive agents. In Proceedings of the World Congress on Neural Networks, (IEEE, July). Cohen, Marvin S. and Freeman, Jared T. (1996). Thinking naturally about uncertainty. In Proceedings of the Human Factors & Ergonomics Society, 40th Annual Meeting. Santa Monica, CA: Human Factors Society. From mike at cns.bu.edu Thu Aug 13 22:00:20 1998 From: mike at cns.bu.edu (Michael Cohen) Date: Thu, 13 Aug 1998 22:00:20 -0400 Subject: Connectionist symbol processing: any progress? References: <199808131910.WAA02170@gemini.math.tau.ac.il> Message-ID: <35D39A34.417A8D4A@cns.bu.edu> Eytan Ruppin wrote: > There is really no ``contradiction in terms'' here. Indeed, associative > memory networks (or attractor neural networks) can activate only one > stored concept at a time. However, such networks should not be viewed as > representing the whole brain but should be (and indeed > are) viewed as representing modular cortical structures such as columns. > Given this interpretation, the problem is resolved; If these networks are > sufficiently loosely coupled then many patterns can be > activated together, resulting in complex and rich dynamics. > > We should be careful before discarding our models on false grounds. We have > too few viable models that can serve as paradigms of information processing. > > Best wishes, > > Eytan Ruppin. I think the real question is what substantial validated progress has been made over and above Formal Language // Transformational Grammar // Standard Artificial Intelligence on aspects of human language processing or parsing via connectionist methods. Have we better understood technological projects such as machine translation or semantic processing, or do we have hard experimental evidence that these techniques are valuable in understanding what humans do? If not, other ideas had best be pursued; if so, then we should keep on plugging away at the problem using these techniques. It's not whether in principle something can be stated using a network automaton. It's whether the language is good for the problem at hand. Surely Thermodynamics is Turing Computable, however Turing Machines alas are good in and of themselves for almost nothing save producing Models of Computability in which to gauge complexity of Algorithms.
--mike -- Michael Cohen mike at cns.bu.edu Associate Professor, Center for Adaptive Systems Work: 677 Beacon, Street, Rm313 Boston, Mass 02115 Home: 25 Stearns Rd, #3 Brookline, Mass 02146 Tel-Work: 617-353-9484 Tel-Home:617-353-7755 From r.gayler at psych.unimelb.edu.au Fri Aug 14 07:57:14 1998 From: r.gayler at psych.unimelb.edu.au (Ross Gayler) Date: Fri, 14 Aug 1998 21:57:14 +1000 Subject: Connectionist symbol processing: any progress? Message-ID: <3.0.32.19980814214225.00699820@myriad.unimelb.edu.au> At 11:33 12/08/98 -0700, Jerry Feldman wrote: .. > It is true that none of this is much like Touretsky's >early attempt at a holographic LISP and that there has >been essentially no work along these lines for a decade. >There are first order computational reasons for this. >These can be (and have been) spelled out technically >but the basic idea is straightforward - PDP (Parallel >Distributed Processing) is a contradiction in terms. To >the extent that representing a concept involves all of >the units in a system, > only one concept can be active at a time. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > Dave Rumelhart says this is stated somewhere in >the original PDP books, but I forget where. The same >basic point accounts for the demise of the physicists' >attempts to model human memory as a spin glass. >Distributed representations do occur in the brain and >are useful in many tasks, conceptual representation just >isn't one of them. .. I would like to see where it has been "spelled out technically" that in a connectionist system "only one concept can be active at a time", because there must be some false assumptions in the proof. This follows from the fact that the systems developed by, for example, Smolensky, Kanerva, Plate, Gayler, and Halford et al *depend* on the ability to manipulate multiple superposed representations, and they actually work. I do accept that > It is true that none of this is much like Touretsky's >early attempt at a holographic LISP and partially accept that there has >been essentially no work along these lines for a decade. but explain it by: 1) Touretzky's work was an important demonstration of technical capability but not a serious attempt at a cognitive architecture. There is no reason to extend that particular line of work. 2) Although the outer-product architectures can (and have) been used with weight learning procedures, such as backpropagation, one of their major attractions is that so much can be achieved without iterative learning. To pursue this line of research requires the power to come from the architecture rather than an optimisation algorithm and a few thousand degrees of freedom. Therefore, this line of research is much less likely to produce a publishable result in a given time frame for a fixed effort (because you can't paper over the gaps with a few extra df). 3) The high-risk, high-effort nature of research into outer-product cognitive architectures without optimisation algorithms makes it unattractive to most researchers. You can't give a problem like this to a PhD student because you don't know the probability of a publishable result. The same argument applies to grant applications. The rational researcher is better advised to attack a more obviously soluble problem. So, I partially disagree with the statement that there has been >essentially no work along these lines for a decade. because there has been related (more cognitively focussed) work proceeding for the last decade. 
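To make the superposition point above concrete, here is a minimal sketch in the spirit of Plate's holographic reduced representations (it is not any of the cited authors' actual systems; the dimensionality, the role and filler names, and the use of NumPy are assumptions of the example). Role-filler pairs are bound by circular convolution, added into a single vector, and then decoded separately:

```python
# Sketch of binding and superposition with circular convolution (HRR-style).
# Assumptions: random high-dimensional vectors; binding = circular convolution;
# unbinding = circular correlation with the role vector; cleanup = best match
# against a small item memory.
import numpy as np

rng = np.random.default_rng(0)
D = 1024                                   # dimensionality (illustrative)

def random_vector():
    return rng.normal(0.0, 1.0 / np.sqrt(D), D)

def bind(a, b):                            # circular convolution via FFT
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def unbind(trace, role):                   # circular correlation (approximate inverse)
    return np.real(np.fft.ifft(np.fft.fft(trace) * np.conj(np.fft.fft(role))))

def cleanup(noisy, item_memory):
    return max(item_memory, key=lambda name: float(np.dot(item_memory[name], noisy)))

symbols = {name: random_vector() for name in ("agent", "patient", "john", "mary")}

# Two role-filler pairs superposed in ONE vector; both remain recoverable.
trace = bind(symbols["agent"], symbols["john"]) + bind(symbols["patient"], symbols["mary"])
fillers = {"john": symbols["john"], "mary": symbols["mary"]}

print(cleanup(unbind(trace, symbols["agent"]), fillers))    # expected: john
print(cleanup(unbind(trace, symbols["patient"]), fillers))  # expected: mary
```

The point is only that more than one binding can be simultaneously active in a single distributed pattern and still be decoded, which is the capability the outer-product architectures listed above depend on.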
It has just been relatively quiet and carried out by a few people who can afford to take on a high effort, high risk project. Cheers, Ross Gayler From goldfarb at unb.ca Fri Aug 14 09:58:44 1998 From: goldfarb at unb.ca (Lev Goldfarb) Date: Fri, 14 Aug 1998 10:58:44 -0300 (ADT) Subject: Connectionist symbol processing: any progress? In-Reply-To: <35D34861.CCE67E25@ministryofthought.com> Message-ID: On Thu, 13 Aug 1998, Mitsu Hadeishi wrote: > I'm not quite sure I agree with your analysis. Since I haven't looked at it in > great detail, I present this as a tentative critique of your presentation. > > Since recurrent ANNs can be made to carry out any Turing operation (i.e., modulo > their finite size, they are Turing equivalent), then unless you are saying that > your represention cannot be implemented on a Turing machine, then it is clearly > NOT the case that recurrent ANNs *cannot* learn arbitrary symbolic > representations. Not having looked at your scheme in detail, of course, I don't > know whether your scheme somehow is unimplementable even on a Turing machine, but > it seems to me you must not be claiming this. Yes, I'm not claiming this. Moreover, I'm not discussing the SIMULATIONAL, or computing, power of a learning model, since in a simulation of a Turing machine one doesn't care how the actual "parameters" are constructed based on a small training set, i.e. no learning is involved. Many present (and future) computational models are "Turing equivalent". That doesn't make them learning models, does it? In other words, as I mentioned in my original posting, IF YOU KNOW THE SYMBOLIC CLASS STRUCTURE of course you can simulate, or encode, it on many machines (similar quote straight from the horse's mouse i.e. from Siegelmann and Sontag paper "On the computational power of neural nets" , section 1.2: "Many other types of 'machines' may be used for universality"). Again, the discovery of the symbolic class structure is a fundamentally different matter, much less trivial than just a simple encoding of this structure using any other, including numeric, structure. > You seem to be basing your argument on the notion that the input space to a > recurrent ANN is a set of numbers, which you interpret as the coordinates of a > vector. However, this is only a kind of vague analogy, since the field > operations of the vector space (addition, multiplication, etc.) have no clear > meaning on the input space. "Adding" two input vectors does not necessarily > result in anything meaningful except in the sense that the recurrent ANN to be > useful must be locally stable with respect to small variations in the input. > However, the actual structure or metric of the input space is in some sense > determined not a priori but by the state of the recurrent ANN itself, and can > change over time both as a result of training and as a result of iteration. The > input space is numbers, yes, but that doesn't make it a vector space. For > example, what properties of the input would be preserved if I, say, added the > vector (10^25, 10^25, ...) to the input? If it is a "vector space" then that > operation would yield something sensible, some symmetries, and yet it obviously > does not. Thus, while I sympathize with your claim that the vector field of R(n) > does not admit to the structure necessary to make visible much symbolic > structure, this in itself does not doom connectionist symbol processing by any > means. 
It appears that something VERY BASIC is missing from the above description: How could a recurrent net learn without some metric and, as far as I know, some metric equivalent to the Euclidean metric? (All "good' metrics on a finite-dimensional vector space are equivalent to the Euclidean, see [1] in my original posting.) > Your argument does have weight when applied to a single-layer perceptron, which > is, after all, just a thresholded/distorted linear transformation. Although it > seemed to take the early connectionist community by surprise, it should be no > surprise at all that a single-layer perceptron cannot learn the parity problem, > because obviously the parity problem is not linearly separable, and how could any > linear discriminator possibly learn a non-linearly-separable problem? However, > we do not live in a world of single-layer perceptrons. Because networks are more > complex than this, arguments about the linearity of the input space seem to me > rather irrelevant. I suspect you mean something else, however. Yes, I do. > I think the intuitive point you are perhaps trying to make is that symbolic > representations are arbitrarily nestable (recursively recombinable), and an input > space which consists of a fixed number of dimensions cannot handle recursive > combinations. However, one can use time-sequence to get around this problem (as > we all are doing when we read and write for example). Rather than make our eyes, > for example, capable of handling arbitrarily recombinable input all at once, we > sequence the input to our eyes by reading material over time. The same trick can > be used with recurrent networks for example. Mitsu, I'm not making just this point. The main point I'm making can be stated as follows. Inductive learning requires some object dissimilarity and/or similarity, measure(s). The accumulated mathematical experience strongly suggests that the distance in the input space must be consistent with the underlying operational, or compositional, structure of the chosen object representation (e.g. topological group, topological vector space, etc). It turns out that while the classical vector space (because of the "simple" compositional structure of its objects) allows essentially one metric consistent with the underlying algebraic structure [1], each symbolic "space" (e.g. strings, trees, graphs) allows infinitely many of them. In the latter case, the inductive learning becomes the learning of the corresponding class distance function (refer to my parity example). Moreover, since some noise is always present in the training set, I cannot imagine how RELIABLE symbolic inductive class structure can be learned from a SMALL training set without the right symbolic bias and without the help of the corresponding symbolic distance measures. Cheers, Lev From pierre at mbfys.kun.nl Fri Aug 14 06:20:16 1998 From: pierre at mbfys.kun.nl (Pierre v.d. 
Laar) Date: Fri, 14 Aug 1998 12:20:16 +0200 Subject: Pruning Using Parameter and Neuronal Metrics Message-ID: <35D40F5F.A71E6B07@mbfys.kun.nl> Dear Connectionists, The following article which has been accepted for publication in Neural Computation can now be downloaded from our ftp-server as ftp://ftp.mbfys.kun.nl/snn/pub/reports/vandeLaar.NC98.ps.Z Yours sincerely, Pierre van de Laar Pruning Using Parameter and Neuronal Metrics written by Pierre van de Laar and Tom Heskes Abstract: In this article, we introduce a measure of optimality for architecture selection algorithms for neural networks: the distance from the original network to the new network in a metric that is defined by the probability distributions of all possible networks. We derive two pruning algorithms, one based on a metric in parameter space and another one based on a metric in neuron space, which are closely related to well-known architecture selection algorithms, such as GOBS. Furthermore, our framework extends the theoretically range of validity of GOBS and therefore can explain results observed in previous experiments. In addition, we give some computational improvements for these algorithms. FTP INSTRUCTIONS unix% ftp ftp.mbfys.kun.nl Name: anonymous Password: (use your e-mail address) ftp> cd snn/pub/reports/ ftp> binary ftp> get vandeLaar.NC98.ps.Z ftp> bye unix% uncompress vandeLaar.NC98.ps.Z unix% lpr vandeLaar.NC98.ps From mitsu at ministryofthought.com Fri Aug 14 14:44:07 1998 From: mitsu at ministryofthought.com (Mitsu Hadeishi) Date: Fri, 14 Aug 1998 11:44:07 -0700 Subject: Connectionist symbol processing: any progress? References: Message-ID: <35D48576.7EFA0648@ministryofthought.com> Lev, Okay, so we agree on the following: Recurrent ANNs have the computational power required. The only thing at issue is the learning algortihm. >How could a recurrent net learn without some metric and, as >far as I know, some metric equivalent to the Euclidean metric? Your arguments so far seem to be focusing on the "metric" on the input space, but this does not in itself mean anything at all about the metric of the learning algorithm as a whole. Clearly the input space is NOT a vector space in the usual sense of the word, at least if you use the metric which is defined by the whole system (learning algortihm, error measure, state of the network). So, what you are now saying is that the metric must be equivalent (rather than equal to) to the Euclidean metric: you do not define what you mean by this. The learning "metric" in the connectionist paradigm changes over time: it is a function of the structure of the learning algorithm and the state of the network, as I mentioned above. The only sense in which the metric is "equivalent" to the Euclidean metric is locally; that is, due to the need to discriminate noise, this metric must be locally stable, thus there is an open neighborhood around most points in the topological input space for which the "metric" vanishes. However, the metric can be quite complex, it can have singularities, it can change over time, it can fold back onto itself, etc. This local stability may not be of interest, however, since the input may be coded so that each discrete possibility is coded as exact numbers which are separated in space. In this case the input space may not be a continuous space at all, but a discrete lattice or something else. 
If the input space is a lattice, then there are no small open neighborhoods around the input points, and thus even this similaity to the Euclidean metric no longer applies. At least, so far, your arguments do not seem to show anything beyond this. >It turns out that while >the classical vector space (because of the "simple" compositional >structure of its objects) allows essentially one metric consistent with >the underlying algebraic structure [1], each symbolic "space" (e.g. >strings, trees, graphs) allows infinitely many of them. Recurrent networks spread the representation of a compound symbol over time; thus, you can present a string of symbols to a recurrent network and its internal state will change. You have not shown, it seems to me, that in this case the learning metric would look anything like a Euclidean metric, or that there would be only "one" such metric. In fact it seems obvious to me that this would NOT be the case. I would like to hear why you might disagree. Mitsu Lev Goldfarb wrote: > On Thu, 13 Aug 1998, Mitsu Hadeishi wrote: > > > I'm not quite sure I agree with your analysis. Since I haven't looked at it in > > great detail, I present this as a tentative critique of your presentation. > > > > Since recurrent ANNs can be made to carry out any Turing operation (i.e., modulo > > their finite size, they are Turing equivalent), then unless you are saying that > > your represention cannot be implemented on a Turing machine, then it is clearly > > NOT the case that recurrent ANNs *cannot* learn arbitrary symbolic > > representations. Not having looked at your scheme in detail, of course, I don't > > know whether your scheme somehow is unimplementable even on a Turing machine, but > > it seems to me you must not be claiming this. > > Yes, I'm not claiming this. Moreover, I'm not discussing the SIMULATIONAL, > or computing, power of a learning model, since in a simulation of a Turing > machine one doesn't care how the actual "parameters" are constructed based > on a small training set, i.e. no learning is involved. Many present (and > future) computational models are "Turing equivalent". That doesn't make > them learning models, does it? > > In other words, as I mentioned in my original posting, IF YOU KNOW THE > SYMBOLIC CLASS STRUCTURE of course you can simulate, or encode, it on > many machines (similar quote straight from the horse's mouse i.e. from > Siegelmann and Sontag paper "On the computational power of neural nets" , > section 1.2: "Many other types of 'machines' may be used for > universality"). Again, the discovery of the symbolic class structure is a > fundamentally different matter, much less trivial than just a simple > encoding of this structure using any other, including numeric, structure. > > > You seem to be basing your argument on the notion that the input space to a > > recurrent ANN is a set of numbers, which you interpret as the coordinates of a > > vector. However, this is only a kind of vague analogy, since the field > > operations of the vector space (addition, multiplication, etc.) have no clear > > meaning on the input space. "Adding" two input vectors does not necessarily > > result in anything meaningful except in the sense that the recurrent ANN to be > > useful must be locally stable with respect to small variations in the input. 
> > However, the actual structure or metric of the input space is in some sense > > determined not a priori but by the state of the recurrent ANN itself, and can > > change over time both as a result of training and as a result of iteration. The > > input space is numbers, yes, but that doesn't make it a vector space. For > > example, what properties of the input would be preserved if I, say, added the > > vector (10^25, 10^25, ...) to the input? If it is a "vector space" then that > > operation would yield something sensible, some symmetries, and yet it obviously > > does not. Thus, while I sympathize with your claim that the vector field of R(n) > > does not admit to the structure necessary to make visible much symbolic > > structure, this in itself does not doom connectionist symbol processing by any > > means. > > It appears that something VERY BASIC is missing from the above > description: How could a recurrent net learn without some metric and, as > far as I know, some metric equivalent to the Euclidean metric? (All "good' > metrics on a finite-dimensional vector space are equivalent to the > Euclidean, see [1] in my original posting.) > > > Your argument does have weight when applied to a single-layer perceptron, which > > is, after all, just a thresholded/distorted linear transformation. Although it > > seemed to take the early connectionist community by surprise, it should be no > > surprise at all that a single-layer perceptron cannot learn the parity problem, > > because obviously the parity problem is not linearly separable, and how could any > > linear discriminator possibly learn a non-linearly-separable problem? However, > > we do not live in a world of single-layer perceptrons. Because networks are more > > complex than this, arguments about the linearity of the input space seem to me > > rather irrelevant. I suspect you mean something else, however. > > Yes, I do. > > > I think the intuitive point you are perhaps trying to make is that symbolic > > representations are arbitrarily nestable (recursively recombinable), and an input > > space which consists of a fixed number of dimensions cannot handle recursive > > combinations. However, one can use time-sequence to get around this problem (as > > we all are doing when we read and write for example). Rather than make our eyes, > > for example, capable of handling arbitrarily recombinable input all at once, we > > sequence the input to our eyes by reading material over time. The same trick can > > be used with recurrent networks for example. > > Mitsu, > > I'm not making just this point. > > The main point I'm making can be stated as follows. Inductive learning > requires some object dissimilarity and/or similarity, measure(s). The > accumulated mathematical experience strongly suggests that the distance in > the input space must be consistent with the underlying operational, or > compositional, structure of the chosen object representation (e.g. > topological group, topological vector space, etc). It turns out that while > the classical vector space (because of the "simple" compositional > structure of its objects) allows essentially one metric consistent with > the underlying algebraic structure [1], each symbolic "space" (e.g. > strings, trees, graphs) allows infinitely many of them. In the latter > case, the inductive learning becomes the learning of the corresponding > class distance function (refer to my parity example). 
Moreover, since > some noise is always present in the training set, I cannot imagine how > RELIABLE symbolic inductive class structure can be learned from a SMALL > training set without the right symbolic bias and without the help of the > corresponding symbolic distance measures. > > Cheers, > Lev From henders at linc.cis.upenn.edu Fri Aug 14 17:01:06 1998 From: henders at linc.cis.upenn.edu (Jamie Henderson) Date: Fri, 14 Aug 1998 17:01:06 -0400 (EDT) Subject: Connectionist symbol processing: any progress? In-Reply-To: <23040.902820867@skinner.boltz.cs.cmu.edu> (Dave_Touretzky@cs.cmu.edu) Message-ID: <199808142101.RAA16445@linc.cis.upenn.edu> Dave Touretzky writes: >I'd like to start a debate on the current state of connectionist >symbol processing? Is it dead? Or does progress continue? .. >The problem, though, was that we >did not have good techniques for dealing with structured information >in distributed form, or for doing tasks that require variable binding. >While it is possible to do these things with a connectionist network, >the result is a complex kludge that, at best, sort of works for small >problems, but offers no distinct advantages over a purely symbolic >implementation. The cases where people had shown interesting >generalization behavior in connectionist nets involved simple >vector-based representations, without nested structures or variable >binding. I just gave a paper at the COLING-ACL'98 conference, which is the main international conference for Computational Linguistics. The paper is on learning to do syntactic parsing using a connectionist architecture that extends SRNs with Temporal Synchrony Variable Binding (ala SHRUTI). This architecture does generalize in a structural way, with variable binding. Crucially, the paper evaluates this learning method on a real corpus of naturally occurring text, and gets results that approach the state of the art in the field (which is all statistical methods these days). I received a surprisingly positive response to this paper. I got comments like "I've never taken connectionist NLP seriously, but you're playing the same game as us". "The game" is training and testing on large corpora of real text, not toy domains. The winner is the method with the lowest error rate. I see three morals in this: - Connectionist approaches to processing structural information have made significant progress, to the point that they can now be justified on purely empirical/engineering grounds. - Connectionist methods do solve problems that current non-connectionist methods have (ad-hoc independence assumptions, sparse data, etc.), and people working in learning know it. - Connectionist NLP researchers should be using modern empirical methods, and they will be taken seriously if they do. The paper is available from my web page (http://www.dcs.ex.ac.uk/~jamie/). Below is the reference and abstract. - Jamie Henderson Henderson, J. and Lane, P. (1998) A Connectionist Architecture for Learning to Parse. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, University of Montreal, Canada. Abstract: We present a connectionist architecture and demonstrate that it can learn syntactic parsing from a corpus of parsed text. The architecture can represent syntactic constituents, and can learn generalizations over syntactic constituents, thereby addressing the sparse data problems of previous connectionist architectures. 
We apply these Simple Synchrony Networks to mapping sequences of word tags to parse trees. After training on parsed samples of the Brown Corpus, the networks achieve precision and recall on constituents that approaches that of statistical methods for this task. (7 pages) ------------------------------- Dr James Henderson Department of Computer Science University of Exeter Exeter EX4 4PT, U.K. http://www.dcs.ex.ac.uk/~jamie/ jamie at dcs.ex.ac.uk ------------------------------- From arbib at pollux.usc.edu Fri Aug 14 18:07:20 1998 From: arbib at pollux.usc.edu (Michael A. Arbib) Date: Fri, 14 Aug 1998 14:07:20 -0800 Subject: What have neural networks achieved? Message-ID: Recently, Stuart Russell addressed the following query to Fellows of the AAAI: > This Saturday there will be a debate with John McCarthy, David Israel, > Stuart Dreyfus and myself on the topic of > "How is the quest for artificial intelligence progressing?" > This is widely publicized, likely to be partially televised, > and will be attended by a lot of journalists. > > For this, and for AAAI's future reference, I'd like to collect > convincing examples of progress, particularly examples that will > convince journalists and the general public. For now all I need > is a URL or other accessible pointer and a one or two sentence > description. (It does not *necessarily* have to be your own work!) > Pictures would be very helpful. This spurs me as I work on the 2nd edition of the Handbook of Brain Theory and Neural Networks (due out in 2 years or so; MIT Press has just issued a paperback of the first edition) to pose to you two related questions: a) What are the "big success stories" (i.e., of the kind the general public could understand) for neural networks contributing to the understanding of "real" brains, i.e., within the fields of cognitive science and neuroscience? b) What are the "big success stories" (i.e., of the kind the general public could understand) for neural networks contributing to the construction of "artificial" brains, i.e., successfully fielded applications of NN hardware and software that have had a major commercial or other impact? ********************************* Michael A. Arbib USC Brain Project University of Southern California Los Angeles, CA 90089-2520, USA arbib at pollux.usc.edu (213) 740-9220; Fax: 213-740-5687 http://www-hbp.usc.edu/HBP/ From goldfarb at unb.ca Sat Aug 15 19:29:10 1998 From: goldfarb at unb.ca (Lev Goldfarb) Date: Sat, 15 Aug 1998 20:29:10 -0300 (ADT) Subject: Connectionist symbol processing: any progress? In-Reply-To: <35D48576.7EFA0648@ministryofthought.com> Message-ID: On Fri, 14 Aug 1998, Mitsu Hadeishi wrote: > Lev, > > Okay, so we agree on the following: > > Recurrent ANNs have the computational power required. The only > thing at issue is the learning algorithm. "The ONLY thing at issue" IS the MAIN thing at issue, because the simulation of the Turing machine is just a clever game, while an adequate model of inductive learning should, among many other things, change our understanding of what science is (see, for example, Alexander Bird, Philosophy of Science, McGill-Queen's University Press, 1998). > >How could a recurrent net learn without some metric and, as > >far as I know, some metric equivalent to the Euclidean metric? > > Your arguments so far seem to be focusing on the "metric" on the > input space, but this does not in itself mean anything at all about > the metric of the learning algorithm as a whole.
What does "the metric of the learning algorithm as a whole" mean? There is no such concept as "the metric of the learning algorithm as a whole". > Clearly the input > space is NOT a vector space in the usual sense of the word, at least > if you use the metric which is defined by the whole system (learning > algorithm, error measure, state of the network). If "the input space is NOT a vector space in the usual sense of the word", then what is it? Are we talking about the formal concepts known in mathematics, or do we not care about such "trifle" things at all? Remember that "even" physicists care about such things, and I said "even", because to model the inductive learning we will need more abstract models. > So, what you are now > saying is that the metric must be equivalent (rather than equal to) to > the Euclidean metric: you do not define what you mean by this. [metrics are equivalent if they induce the same topology, or the same convergence] > The learning "metric" in the connectionist paradigm changes over > time: it is a function of the structure of the learning algorithm and > the state of the network, as I mentioned above. The only sense in > which the metric is "equivalent" to the Euclidean metric is locally; > that is, due to the need to discriminate noise, this metric must be > locally stable, thus there is an open neighborhood around most points > in the topological input space for which the "metric" vanishes. > However, the metric can be quite complex, it can have singularities, > it can change over time, it can fold back onto itself, etc. > > This local stability may not be of interest, however, since the > input may be coded so that each discrete possibility is coded as exact > numbers which are separated in space. In this case the input space > may not be a continuous space at all, but a discrete lattice or > something else. If the input space is a lattice, then there are no > small open neighborhoods around the input points, and thus even this > similarity to the Euclidean metric no longer applies. At least, so > far, your arguments do not seem to show anything beyond this. > > >It turns out that while > >the classical vector space (because of the "simple" compositional > >structure of its objects) allows essentially one metric consistent with > >the underlying algebraic structure [1], each symbolic "space" (e.g. > >strings, trees, graphs) allows infinitely many of them. > > Recurrent networks spread the representation of a compound symbol > over time; thus, you can present a string of symbols to a recurrent > network and its internal state will change. You have not shown, it > seems to me, that in this case the learning metric would look anything > like a Euclidean metric, or that there would be only "one" such > metric. In fact it seems obvious to me that this would NOT be the > case. I would like to hear why you might disagree. Mitsu, Forgive me for the analogy, but from the above as well as from other published sources, it appears to me that in the "connectionist symbol processing", by throwing into one model two, I strongly suggest, INCOMPATIBLE ingredients (vector space model and the symbolic operations) one hopes to prepare a magic soup for inductive learning. I strongly believe that this is not a scientifically fruitful approach. Why? Can I give you a one sentence answer?
If you look very carefully at the topologies induced on the set of strings (over an alphabet of size > 1) by various symbolic distances (of type given in the parity class problem), then you will discover that they have hardly anything to do with the continuous topologies we are used to from the classical mathematics. In this sense, the difficulties ANNs have with the parity problem are only the tip of the iceberg. So, isn't it scientifically more profitable to work DIRECTLY with the symbolic topologies, i.e. the symbolic distance functions, by starting with some initial set of symbolic operations and then proceeding in a systematic manner to seek the optimal topology (i.e. the optimal set of weighted operations) for the training set. To simplify things, this is what the evolving transformation system model we are developing attempts to do. It appears that there are profound connections between the relevant symbolic topologies (and hardly any connections with the classical numeric topologies). Based on those connections, we are developing an efficient inductive learning model that will work with MUCH SMALLER training set than has been the case in the past. The latter is possible due to the fact that, typically, computation of the distance between two strings involves many operations and the optimization function involves O(n*n) interdistances, where n is the size of the training set. Cheers, Lev From mitsu at ministryofthought.com Sat Aug 15 20:22:52 1998 From: mitsu at ministryofthought.com (Mitsu Hadeishi) Date: Sat, 15 Aug 1998 17:22:52 -0700 Subject: Connectionist symbol processing: any progress? References: Message-ID: <35D6265C.D637A673@ministryofthought.com> Lev Goldfarb wrote: > On Fri, 14 Aug 1998, Mitsu Hadeishi wrote: > > Your arguments so far seem to be focusing on the "metric" on the > > input space, but this does not in itself mean anything at all about > > the metric of the learning algorithm as a whole. > > What does it mean "the metric of the learning algorithm as a whole"? > There is no such a concept as "the metric of the learning algorithm as a > whole". Since you are using terms like "metric" extremely loosely, I was also doing so. What I mean here is that with certain connectionist schemes, for example those that use an error function of some kind, one could conceive of the error measure as a kind of distance function (however, it is not a metric formally speaking). However, the error measure is far more useful and important than the "metric" you might impose on the input space when conceiving of it as a vector space, since the input space is NOT a vector space. > If "the input space is NOT a vector space in the usual sense of the word", > then what is it? Are we talking about the formal concepts known in > mathematics or we don't care about such "trifle" things at all? > Remember, that "even" physicists care about such things, and I said > "even", because to model the inductive learning we will need more > abstract models. As you (should) know, a vector space is supposed to have vector field symmetries. For example, something should be preserved under rotations and translations of the input vectors. However, what do you get when you do arbitrary rotations of the input to a connectionist network? I don't mean rotations of, say, the visual field to a pattern recognition network, but rather taking the actual values of the inputs to each neuron in a network as coordinates to a vector, and then "rotating" them or translating them, or both. 
What meaning does this have when used with a recurrent connectionist architecture? It seems to me that it has very little meaning if any. > [metrics are equivalent if they induce the same topology, or the same > convergence] Again, the only really important function is the structure of the error function, not the "metric" on the input space conceived as a vector space, and it isn't even a metric in the usual sense of the word. > Can I give you a one sentence answer? If you look very carefully at the > topologies induced on the set of strings (over an alphabet of size > 1) by > various symbolic distances (of type given in the parity class problem), > then you will discover that they have hardly anything to do with the > continuous topologies we are used to from the classical mathematics. In > this sense, the difficulties ANNs have with the parity problem are only > the tip of the iceberg. I do not dispute the value of your work, I simply dispute the fact that you seem to think it dooms connectionist approaches, because your intuitive arguments against connectionist approaches are not cogent it seems to me. While your work is probably quite valuable, and I think I understand what you are getting at, I see no reason why what you are talking about would prevent a connectionist approach (based on a recurrent or more sophisticated architecture) from being able to discover the same symbolic metric---because, as I say, the input space is not in any meaningful sense a vector space, and the recurrent architecture allows the "metric" of the learning algorithm, it seems to me, to acquire precisely the kind of structure that you need it to---or, at least, I do not see in principle why it cannot. The reason this is so is again because the input is spread out over multiple presentations to the network. There are good reasons to use connectionist schemes, however, I believe, as opposed to purely symbolic schemes. For one: symbolic techniques are inevitably limited to highly discrete representations, whereas connectionist architectures can at least in theory combine both discrete and continuous representations. Two, it may be that the simplest or most efficient representation of a given set of rules may include both a continuous and a discrete component; that is, for example, considering issues such as imprecise application of rules, or breaking of rules, and so forth. For example, consider poetic speech; the "rules" for interpreting poetry are clearly not easily enumerable, yet human beings can read poetry and get something out of it. A purely symbolic approach may not be able to easily capture this, whereas it seems to me a connectionist approach has a better chance of dealing with this kind of situation. I can see value in your approach, and things that connectionists can learn from it, but I do not see that it dooms connectionism by any means. Mitsu > > > So, isn't it scientifically more profitable to work DIRECTLY with the > symbolic topologies, i.e. the symbolic distance functions, by starting > with some initial set of symbolic operations and then proceeding in a > systematic manner to seek the optimal topology (i.e. the optimal set of > weighted operations) for the training set. To simplify things, this is > what the evolving transformation system model we are developing attempts > to do. It appears that there are profound connections between the relevant > symbolic topologies (and hardly any connections with the classical numeric > topologies).
Based on those connections, we are developing an efficient > inductive learning model that will work with MUCH SMALLER training set > than has been the case in the past. The latter is possible due to the fact > that, typically, computation of the distance between two strings involves > many operations and the optimization function involves O(n*n) > interdistances, where n is the size of the training set. > > Cheers, > Lev From mitsu at ministryofthought.com Sat Aug 15 20:37:06 1998 From: mitsu at ministryofthought.com (Mitsu Hadeishi) Date: Sat, 15 Aug 1998 17:37:06 -0700 Subject: Connectionist symbol processing: any progress? References: <35D6265C.D637A673@ministryofthought.com> Message-ID: <35D629B2.83598619@ministryofthought.com> Mitsu Hadeishi wrote: > Lev Goldfarb wrote: > > However, the error measure is far more useful and important than > the "metric" you might impose on the input space when conceiving of it as a > vector space, since the input space is NOT a vector space. Clarification: I really should say you do not have to conceive of the input space as a vector space. It may in fact behave like a vector space (locally) if the architecture of the network, the nature of the learning algorithm, and the training sets are structured in a particular way. However, it will not necessarily behave this way as the network evolves---and particularly if you conceive of the input space as spread out through time for a recurrent network, the notion of it as a vector space doesn't work at all. The main point is that it is the feedback mechanism (error function or other mechanism) which is truly important when considering how the learning algorithm is biased and will evolve, not the "metric" on the initial input space. Mitsu From goldfarb at unb.ca Sat Aug 15 22:32:12 1998 From: goldfarb at unb.ca (Lev Goldfarb) Date: Sat, 15 Aug 1998 23:32:12 -0300 (ADT) Subject: Connectionist symbol processing: any progress? In-Reply-To: <35D6265C.D637A673@ministryofthought.com> Message-ID: On Sat, 15 Aug 1998, Mitsu Hadeishi wrote: > Since you are using terms like "metric" extremely loosely, I was also doing > so. Please, note that although I'm not that precise, I have not used the "terms like 'metric' extremely loosely". > What I mean here is that with certain connectionist schemes, for example > those that use an error function of some kind, one could conceive of the error > measure as a kind of distance function (however, it is not a metric formally > speaking). However, the error measure is far more useful and important than > the "metric" you might impose on the input space when conceiving of it as a > vector space, since the input space is NOT a vector space. If the input and "the intermediate" spaces are not vector spaces, then what is the advantage of the "connectionist" architectures? > > If "the input space is NOT a vector space in the usual sense of the word", > > then what is it? Are we talking about the formal concepts known in > > mathematics or we don't care about such "trifle" things at all? > > Remember, that "even" physicists care about such things, and I said > > "even", because to model the inductive learning we will need more > > abstract models. > > As you (should) know, a vector space is supposed to have vector field > symmetries. For example, something should be preserved under rotations and > translations of the input vectors. However, what do you get when you do > arbitrary rotations of the input to a connectionist network? 
I don't mean > rotations of, say, the visual field to a pattern recognition network, but > rather taking the actual values of the inputs to each neuron in a network as > coordinates to a vector, and then "rotating" them or translating them, or > both. What meaning does this have when used with a recurrent connectionist > architecture? It seems to me that it has very little meaning if any. > > > [metrics are equivalent if they induce the same topology, or the same > > convergence] > > Again, the only really important function is the structure of the error > function, not the "metric" on the input space conceived as a vector space, and > it isn't even a metric in the usual sense of the word. > > > Can I give you a one sentence answer? If you look very carefully at the > > topologies induced on the set of strings (over an alphabet of size > 1) by > > various symbolic distances (of type given in the parity class problem), > > then you will discover that they have hardly anything to do with the > > continuous topologies we are used to from the classical mathematics. In > > this sense, the difficulties ANNs have with the parity problem are only > > the tip of the iceberg. > > I do not dispute the value of your work, I simply dispute the fact that you > seem to think it dooms connectionist approaches, because your intuitive > arguments against connectionist approaches are not cogent it seems to me. > While your work is probably quite valuable, and I think I understand what you > are getting at, I see no reason why what you are talking about would prevent a > connectionist approach (based on a recurrent or more sophisticated > architecture) from being able to discover the same symbolic metric---because, > as I say, the input space is not in any meaningful sense a vector space, and > the recurrent architecture allows the "metric" of the learning algorithm, it > seems to me, to acquire precisely the kind of structure that you need it > to---or, at least, I do not see in principle why it cannot. The reason this > is so is again because the input is spread out over multiple presentations to > the network. > > There are good reasons to use connectionist schemes, however, I believe, as > opposed to purely symbolic schemes. For one: symbolic techniques are > inevitably limited to highly discrete representations, whereas connectionist > architectures can at least in theory combine both discrete and continuous > representations. The main reason we are developing the ETS model is precisely related to the fact that we believe it offers THE ONLY ONE POSSIBLE NATURAL (and fundamentally new) SYMBIOSIS of the discrete and the continuous FORMALISMS as opposed to the unnatural ones. I would definitely say (and you would probably agree) that (if, indeed, this is the case) it is the most important consideration. Moreover, it turns out that the concept of a fuzzy set, which was originally introduced in a rather artificial manner that didn't clarify the underlying source of fuzziness (and this have caused an understandable and substantial resistance to its introduction), emerges VERY naturally within the ETS model: the definition of the class via the corresponding distance function typically and naturally induces the fuzzy class boundary and also reveals the source of fuzziness, which includes the interplay between the corresponding weighted operations and (in the case of noise in the training set) a nonzero radius. 
Note that in the parity class problem, the parity class is not fuzzy, as reflected in the corresponding weighting scheme and the radius of 0. > Two, it may be that the simplest or most efficient > representation of a given set of rules may include both a continuous and a > discrete component; that is, for example, considering issues such as imprecise > application of rules, or breaking of rules, and so forth. For example, > consider poetic speech; the "rules" for interpreting poetry are clearly not > easily enumerable, yet human beings can read poetry and get something out of > it. A purely symbolic approach may not be able to easily capture this, > whereas it seems to me a connectionist approach has a better chance of dealing > with this kind of situation. > > I can see value in your approach, and things that connectionists can learn > from it, but I do not see that it dooms connectionism by any means. See the previous comment. Cheers, Lev From mitsu at ministryofthought.com Sat Aug 15 23:47:28 1998 From: mitsu at ministryofthought.com (Mitsu Hadeishi) Date: Sat, 15 Aug 1998 20:47:28 -0700 Subject: Connectionist symbol processing: any progress? References: Message-ID: <35D65650.3B387A0D@ministryofthought.com> Lev Goldfarb wrote: > On Sat, 15 Aug 1998, Mitsu Hadeishi wrote: > > > Since you are using terms like "metric" extremely loosely, I was also doing > > so. > > Please, note that although I'm not that precise, I have not used the > "terms like 'metric' extremely loosely". I am referring to this statement: >How could a recurrent net learn without some metric and, as >far as I know, some metric equivalent to the Euclidean metric? Here you are talking about the input space as though the Euclidean metric on that space is particularly key, when it is rather the structure of the whole network, the feedback scheme, the definition of the error measure, the learning algorithm, and so forth which actually create the relevant and important mathematical structure. In a sufficiently complex network, you can pretty much get any arbitrary map you like from the input space to the output, and the error measure is biased by the specific nature of the training set (for example), and is measured on the output of the network AFTER it has gone through what amounts to an arbitrary differentiable transformation. By this time, the "metric" on the original input space can be all but destroyed. Add recurrency and you even get rid of the fixed dimensionality of the input space. In the quote above, it appears you are implying that there is some direct relationship between the metric on the initial input space and the operation of the learning algorithm. I do not see how this is the case. > The main reason we are developing the ETS model is precisely related to > the fact that we believe it offers THE ONLY ONE POSSIBLE NATURAL (and > fundamentally new) SYMBIOSIS of the discrete and the continuous FORMALISMS > as opposed to the unnatural ones. I would definitely say (and you would > probably agree) that (if, indeed, this is the case) it is the most > important consideration.
> > Moreover, it turns out that the concept of a fuzzy set, which was > originally introduced in a rather artificial manner that didn't clarify > the underlying source of fuzziness (and this have caused an understandable > and substantial resistance to its introduction), emerges VERY naturally > within the ETS model: the definition of the class via the corresponding > distance function typically and naturally induces the fuzzy class boundary > and also reveals the source of fuzziness, which includes the interplay > between the corresponding weighted operations and (in the case of noise in > the training set) a nonzero radius. Note that in the parity class problem, > the parity class is not fuzzy, as reflected in the corresponding weighting > scheme and the radius of 0. Well, what one mathematician calls natural and the other calls artificial may be somewhat subject to taste as well as rational argument. At this point one can get into the realm of mathematical aesthetics or philosophy rather than hard science. From Tony.Plate at MCS.VUW.AC.NZ Sun Aug 16 04:30:59 1998 From: Tony.Plate at MCS.VUW.AC.NZ (Tony Plate) Date: Sun, 16 Aug 1998 20:30:59 +1200 Subject: Connectionist symbol processing: any progress? In-Reply-To: Your message of "Tue, 11 Aug 1998 03:34:27 -0400." <23040.902820867@skinner.boltz.cs.cmu.edu> Message-ID: <199808160830.UAA08508@rialto.mcs.vuw.ac.nz> Work has been progressing on higher-level connectionist processing, but progress has not been blindingly fast. As others have noted, it is a difficult area. One of the things that has recently renewed my interest in the idea of using distributed representations for processing complex information was finding out about Latent Semantic Analysis/Indexing (LSA/LSI) at NIPS*97. LSA is a method for taking a large corpus of text and constructing vector representations for words in such a way that similar words are represented by similar vectors. LSA works by representing a word by its context (harkening back to a comment I recently saw attributed to Firth 1957: "You shall know a word by the company it keeps" :-), and then reducing the dimensionality of the context using singular value decomposition (SVD) (v. closely related to principal component analysis (PCA)). The vectors constructed by LSA can be of any size, but it seems that moderately high dimensions work best: 100 to 300 elements. It turns out that one can do all sorts of surprising things with these vectors. One can construct vectors which represent documents and queries by merely summing the vectors for their words and do information retrieval, automatically getting around the problem of synonyms (since synonyms tend to have similar vectors). One can do the same thing with questions and multiple choice answers and pass exams (e.g., first year psychology exams, TOEFL tests). And all this just treating texts as unordered bags of words. While these results are intriguing, they don't achieve the goal of complex connectionist reasoning. However, they could provide an excellent source of representations for use in a more complex connectionist system (using connectionist in a very broad sense here). LSA is fast enough that it can be used on 10s of thousands of documents to derive vectors for thousands of words. This is exciting because it could allow one to start building connectionist systems which deal with full-range vocabularies and large varied task sets (as in info.
retrieval and related tasks), and which do more interesting processing than just forming the bag-of-words content of a document a la vanilla-LSA. As Ross Gayler mentioned, analogy processing is a very promising area for application of connectionist ideas. There are a few reasons for this being interesting: people do it all the time, structural relationships are important to the task, no explicit variables need be involved, and rule-based reasoning can be seen as a very specialized version of the task. One very interesting model of analogical processing that was presented at the workshop in Bulgaria (in July) was John Hummel and Keith Holyoak's LISA model (ref at end). This model uses distributed representations for roles and fillers, binding them together with temporal synchrony, and achieves quite impressive results (John, in case you're listening, this is not to say that I think temporal binding is the right way to go, but it's an impressive model and presents a good challenge to other approaches.) I have to disagree with two of the technical comments made in this discussion: Jerry Feldman wrote: "Parallel Distributed Processing is a contradiction in terms. To the extent that representing a concept involves all of the units in a system, only one concept can be active at a time." One can easily represent more than one concept at a time in distributed representations. One of their beauties is the soft limit on the number of concepts that can be represented at once. This limit depends on the dimensionality of the system, the redundancy in representations, the similarity structure of the concepts, and so forth. All of the units in the system might be involved in representing a concept, but redundancy makes none essential. And of course one can also have different modules within a system. But, my point is that even within a single PDP module, one can still represent (and process) multiple concepts at once. Mitsu Hadeishi wrote: "an input space which consists of a fixed number of dimensions cannot handle recursive combinations" A number of people, including myself, have shown that it is possible to represent arbitrarily nested concepts in space with a fixed number of dimensions. Furthermore, the resulting representations have interesting and useful properties not shared by their symbolic counterparts. Very briefly, the way one can do this is by using vector-space operations for addition and multiplication to implement the conceptual operations of forming collections and binding concepts, respectively. For example, one can build a distributed representation for a shape configuration#33 of "circle above triangle" as: config33 = vertical + circle + triangle + ontop*circle + below*triangle By using an appropriate multiplication operation (I used circular, or wrapped, convolution), the reduced representation of the compositional concept (e.g., config33) has the same dimension as its components, and can readily be used as a component in other higher-level relations. Quite a few people have devised schemes for this type of representation, e.g., Paul Smolensky (Tensor Products), Jordan Pollack (RAAMs), Alessandro Sperduti (LRAAMs), Pentti Kanerva (Binary Spatter Codes). Another related scheme that uses distributed representations and tensor product bindings (but not role-filler bindings) is Halford, Wilson and Phillips' STAR model.
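To make the two constructions described above concrete, here are two minimal Python/NumPy sketches. Both are illustrative only: the toy corpus, the dimensionalities, the random seed, and the cleanup-by-dot-product step are assumptions made for the example, not the setups used in the papers cited below. First, the LSA pipeline: build a word-by-document count matrix, reduce it with an SVD, and treat a document or query as the sum of its word vectors (a production LSA run would use a much larger corpus, log/entropy term weighting, and 100 to 300 dimensions).

import numpy as np

# Toy corpus: each document is an unordered bag of words (illustrative only).
docs = [
    "human interface computer",
    "survey user computer system response time",
    "eps user interface system",
    "system human system eps",
    "user response time",
    "trees",
    "graph trees",
    "graph minors trees",
    "graph minors survey",
]
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}

# Word-by-document count matrix (real LSA would apply log/entropy weighting here).
X = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        X[index[w], j] += 1

# Reduce dimensionality with SVD; keep k dimensions.
k = 2
U, s, Vt = np.linalg.svd(X, full_matrices=False)
word_vecs = U[:, :k] * s[:k]          # one k-dimensional vector per word

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def doc_vec(text):
    # A document or query vector is just the sum of its word vectors.
    return sum(word_vecs[index[w]] for w in text.split() if w in index)

print(cosine(word_vecs[index["human"]], word_vecs[index["user"]]))            # related terms
print(cosine(doc_vec("human computer interaction"), doc_vec("user interface system")))

Second, a sketch of the circular-convolution (HRR) encoding and decoding of config33: roles are bound to fillers by circular convolution, the bound pairs and the loose fillers are superimposed by addition, and a filler is recovered by convolving with the approximate inverse (involution) of its role and comparing the noisy result against the known vectors by dot product.

import numpy as np

rng = np.random.default_rng(0)
n = 1024                      # vector dimensionality (an arbitrary illustrative choice)

def randvec():
    # elements drawn from N(0, 1/n) so each vector has expected length about 1
    return rng.normal(0.0, 1.0 / np.sqrt(n), n)

def cconv(a, b):
    # circular (wrapped) convolution: the binding operation
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def involution(a):
    # approximate inverse used for decoding: a*[i] = a[-i mod n]
    return np.concatenate(([a[0]], a[:0:-1]))

vertical, circle, triangle, ontop, below = (randvec() for _ in range(5))

config33 = vertical + circle + triangle + cconv(ontop, circle) + cconv(below, triangle)

# Decode "what is on top?" by convolving with the approximate inverse of the role,
# then cleaning up against the known fillers by dot-product comparison.
probe = cconv(involution(ontop), config33)
for name, v in [("circle", circle), ("triangle", triangle), ("vertical", vertical)]:
    print(name, float(np.dot(probe, v)))

With a few hundred to a thousand dimensions the decoded probe should match the correct filler far more strongly than the alternatives, which is the sense in which the reduced representation functions like a pointer whose contents are nevertheless directly visible.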
Some of the useful properties of these types of distributed representations are as follows: (a) The reduced, distributed representation (e.g., config33) functions like a pointer, but is more than a mere pointer in that information about its contents is available directly without having to "follow" the "pointer." This makes it possible to do some types of processing without having to unpack the structures. (b) The vector-space similarity of representations (i.e., the dot-product) reflects both superficial and structural similarity of structures. (c) There are fast, approximate, vector-space techniques for doing "structural" computations like finding corresponding objects in two analogies, or doing structural transformations. Some references: (Lots of LSA-related papers at: http://lsa.colorado.edu/ http://superbook.bellcore.com/~std/LSI.papers.html ) @article{deerwester-dumais-landauer-furnas-harshman-90, author = "S. Deerwester and S. T. Dumais and T. K. Landauer and G. W. Furnas and R. A. Harshman", year = "1990", title = "Indexing by latent semantic analysis", journal = "Journal of the Society for Information Science", volume = "41", number = "6", pages = "391-407", annote = "first technical LSI paper; good background." } @inproceedings{landauer-laham-foltz-98, author = "T. K. Landauer and D. Laham and P. W. Foltz", title = "Learning Human-like Knowledge with Singular Value Decomposition: A Progress Report", booktitle = "Neural Information Processing Systems (NIPS*97)", year = "1998" } @article{landauer-dumais-97, author = "T. K. Landauer and S. T. Dumais", year = "1997", title = "Solution to Plato's Problem: The Latent Semantic Analysis Theory of Acquisition, Induction and Representation of Knowledge", journal = "Psychological Review", pages = "211-240", volume = "104", number = "2" } @inproceedings{bartell-cottrell-belew-92, author = "B.T. Bartell and G.W. Cottrell and R.K. Belew", year = "1992", title = "{Latent Semantic Indexing} is an optimal special case of multidimensional scaling", booktitle = "Proc SIGIR-92", publisher = "ACM Press", address = "New York" } @article{hummel-holyoak-97, author = "J. E. Hummel and K. J. Holyoak", title = "Distributed representations of structure: {A} theory of analogical access and mapping", journal = "Psychological Review", year = 1997, volume = 104, number = 3, pages = "427--466", annote = "LISA paper" } @inproceedings{kanerva-96, author = "P. Kanerva", year = 1996, title = "Binary spatter-coding of ordered K-tuples", volume = 1112, pages = "869-873", publisher = "Springer", editor = "C. von der Malsburg and W. von Seelen and J.C. Vorbruggen and B. Sendhoff", booktitle = "Artificial Neural Networks--ICANN Proceedings", series = "Lecture Notes in Computer Science", address = "Berlin", keywords = "HRRs, distributed representations" } @unpublished{halford-wilson-phillips-bbs98, author = "Halford, Graeme and Wilson, William H. and Phillips, Steven", title = "Processing Capacity Defined by Relational Complexity: Implications for Comparative, Developmental, and Cognitive Psychology", note = "Behavioral and Brain Sciences", year = "to appear" } @InBook{plate-97c, author = "Tony A.
Plate", chapter = "A Common Framework for Distributed Representation Schemes for Compositional Structure", title = "Connectionist Systems for Knowledge Representation and Deduction", publisher = "Queensland University of Technology", year = "1997", editor = "Fr\'ed\'eric Maire and Ross Hayward and Joachim Diederich", pages = "15-34" } @incollection{plate-98, author = "Tony Plate", title = "Analogy retrieval and processing with distributed represenations", year = "1998", booktitle = "Advances in Analogy Research: Integration of Theory and Data from the Cognitive, Computational, and Neural Sciences", pages = "154--163", editor = "Keith Holyoak and Dedre Gentner and Boicho Kokinov", publisher = "NBU Series in Cognitive Science, New Bugarian University, Sofia." } Tony Plate, Computer Science Voice: +64-4-495-5233 ext 8578 School of Mathematical and Computing Sciences Fax: +64-4-495-5232 Victoria University, PO Box 600, Wellington, New Zealand tap at mcs.vuw.ac.nz http://www.mcs.vuw.ac.nz/~tap From bryan at cog-tech.com Sun Aug 16 10:18:49 1998 From: bryan at cog-tech.com (Bryan B. Thompson) Date: Sun, 16 Aug 1998 10:18:49 -0400 Subject: Connectionist symbol processing: any progress? Message-ID: <199808161418.KAA30062@cti2.cog-tech.com> Lev wrote: > Can I give you a one sentence answer? If you look very carefully at > the topologies induced on the set of strings (over an alphabet of > size > 1) by various symbolic distances (of type given in the parity > class problem), then you will discover that they have hardly > anything to do with the continuous topologies we are used to from > the classical mathematics. In this sense, the difficulties ANNs have > with the parity problem are only the tip of the iceberg. Mitsu wrote: > I see no reason why what you are talking about would prevent a > connectionist approach (based on a recurrent or more sophisticated > architecture) from being able to discover the same symbolic > metric---because, as I say, the input space is not in any meaningful > sense a vector space, and the recurrent architecture allows the > "metric" of the learning algorithm, it seems to me, to acquire > precisely the kind of structure that you need it to---or, at least, > I do not see in principle why it cannot. The reason this is so is > again because the input is spread out over multiple presentations to > the network. > There are good reasons to use connectionist schemes, however, I > believe, as opposed to purely symbolic schemes. For one: symbolic > techniques are inevitably limited to highly discrete > representations, whereas connectionist architectures can at least in > theory combine both discrete and continuous representations. "Connectionist" is too broad a term to distinguish inherently symbolic from approaches which are not inherently symbolic, but which have yet to be clearly excluded from being able to induce approximately symbolic processing solutions. In an attempt to characterize these two approaches, the one builds in symbolic processing structure (this is certainly true for Shruti and, from reading Lev's messages, appears to be true of that research as well), while the other intends to utilize a "recurrent or more sophisticated architecture" to induce the desired behavior without "special" mechanisms. It is certainly true that we have the ability to, and, of necessity, must, construct connectionist systems with different inductive biases. 
A recurrent MLP (multi-layer-perceptron) *typically* builds in scalar weights, sigmoid transfer functions, high-forward connectivity, recurrent connections, etc. Simultaneous recurrent networks are similar, but build in a settling process by which an output/behavior is computed. In the work with Shruti, we have built into a simultaneous recurrent network localized structure and transfer functions which facilitate "symbolic" processing. While such specialized structure does not preclude using, e.g., backpropagation for learning, it also opens up explicit search of the structure space by methods more similar to evolutionary programming. My point, here, is not that we have the "right" solution, but that the architectural variations which are being discussed need not be exclusive. Given a focus on "symbolic" processing, I suggest that there are two issues which have dominated this discussion: - What inductive biases should be built into connectionist architectures for this class of problems? This question should include choices of "structure" and "learning rules". - What meaningful differences exist in the learned behavior of systems with different inductive biases. Especially, questions of rigidity and generalization of the solutions, the efficiency of learning, and the preservation of plasticity seem important. I feel that Lev is concerned that learning algorithms using recurrent networks with distributed representations have an inductive bias which limits their practical capacity to induce solutions (internal representations / transforms) for domains in which symbol processing is critical. I agree with this "intuitively," but I would like to see a firmer characterization of why such networks are ill-suited for "symbolic processing" (quoted to indicate that good solutions need not be purely symbolic and could exhibit aspects of more classical ANNs). I am thinking about an effort several years ago which was made to characterize problems (and representations) which were "GA" hard -- that is, which were ill suited to the transforms and inductive biases of (certain classes of) genetic algorithms. A similar effort with various classes of connectionist architectures would be quite useful in moving beyond such "intuitive" senses of the fitness of different approaches and the expected utility of research in different connectionist solutions for different classes of problems. I feel that it is a reasonable argument that evolution has facilitated us with both gross and localized structure. That includes the body and the brain. Within the "brain" there are clearly systems that are structurally (pre-)disposed for different kinds of computing, witness the cerebellum vs the cerebral cortex. We do not need to, and should not, make the same structural choices for connectionist solutions for different classes of problems. My own intuitive "argument" leads me to believe that distributed connectionist solutions are unlikely to prove suitable for symbolic processing. Recurrent, and simultaneous recurrent, distributed networks may possess the representational capacity, but I maintain doubts concerning their inductive capacity for "symbolic" domains.
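As a concrete footnote to the distinction just drawn between representational and inductive capacity, below is a minimal, hand-wired Python/NumPy sketch (the hard-threshold units and the particular weights are illustrative assumptions, and nothing here is learned) of a recurrent network whose single state unit tracks the parity of a bit string presented one symbol per time step. The existence of such weights speaks only to representational capacity; whether a learning rule reliably finds them from a small, noisy training set is the inductive question being debated in this thread.

import numpy as np

step = lambda z: (z > 0).astype(float)     # hard-threshold unit (an illustrative choice)

# Hand-set weights implementing new_state = XOR(old_state, input):
# hidden unit h1 computes OR(old_state, x), h2 computes AND(old_state, x),
# and the new state is h1 AND NOT h2.
W_in, W_rec  = np.array([1.0, 1.0]), np.array([1.0, 1.0])
b_hidden     = np.array([-0.5, -1.5])
w_out, b_out = np.array([1.0, -1.0]), -0.5

def parity(bits):
    s = 0.0                                # recurrent state, initially "even"
    for x in bits:                         # one symbol per time step
        h = step(W_in * x + W_rec * s + b_hidden)
        s = float(step(np.dot(w_out, h) + b_out))
    return s

print(parity([1, 0, 1, 1]))                # 1.0 : odd number of ones
print(parity([1, 1, 0, 0, 1, 1]))          # 0.0 : even number of ones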
Perhaps a fruitful approach would be to enumerate characteristics of a system which facilitate learning and behavior in domains which are considered "symbolic" (including variable binding, appropriate generalization, plasticity, etc.), and to see how those properties might be realized or approximated within the temporal dynamics of a class of distributed recurrent networks. This effort must, of course, not seek to allocate too much responsibility to a single system and, therefore, needs to be part of a broader theory of the structure of mind and organism. If we consider the primary mechanism of recurrence in distributed representations as enfolding space into time, I still have reservations about the complexity that the agent / organism faces in learning an enfolding of mechanisms sufficient to support symbolic processing. --bryan thompson PS: I will be on vacation next week (Aug 17-21) and will be unable to answer any replies until I return. From rsun at research.nj.nec.com Sun Aug 16 19:24:01 1998 From: rsun at research.nj.nec.com (Ron Sun) Date: Sun, 16 Aug 1998 19:24:01 -0400 Subject: Connectionist symbol processing: any progress? Message-ID: <199808162324.TAA07433@pc-rsun.nj.nec.com> Along the line of Tony's summary of work on distributed connectionist models, here is my (possibly biased) summary of the state of the art of localist connectionist symbolic processing work. There has been a variety of work in developing LOCALIST connectionist models for symbolic processing, as pointed out by postings of Jerry Feldman and Shastri. The work spans a large spectrum of application areas in AI and cognitive science. Although it has been discussed somewhat, a more detailed list of work in this area includes: ------------------------ REASONING (commonsense reasoning, logic reasoning, case-based reasoning, reasoning based on schemas/frames) L. Shastri and V. Ajjanagadde (1993). From goldfarb at unb.ca Sun Aug 16 21:20:59 1998 From: goldfarb at unb.ca (Lev Goldfarb) Date: Sun, 16 Aug 1998 22:20:59 -0300 (ADT) Subject: Connectionist symbol processing: any progress? In-Reply-To: <35D65650.3B387A0D@ministryofthought.com> Message-ID: On Sat, 15 Aug 1998, Mitsu Hadeishi wrote: > Lev Goldfarb wrote: > > > On Sat, 15 Aug 1998, Mitsu Hadeishi wrote: > > > > > Since you are using terms like "metric" extremely loosely, I was also doing > > > so. > > > > Please, note that although I'm not that precise, I have not used the > > "terms like 'metric' extremely loosely". > > I am referring to this statement: > > >How could a recurrent net learn without some metric and, as >far as I know, some metric equivalent to the Euclidean metric? Here you are talking > about the input space as though the Euclidean metric on that space is particularly > key, when it is rather the structure of the whole network, the feedback scheme, the > definition of the error measure, the learning algorithm, and so forth which actually > create the relevant and important mathematical structure. Mitsu, I'm afraid, I failed to see what is wrong with my (quoted) question. First, I suggested in it that to do inductive learning properly ONE MUST HAVE AN EXPLICIT AND MEANINGFUL DISTANCE FUNCTION ON THE INPUT SPACE. And, second, given the latter plus the "foundations of the connectionism" (e.g.
Michael Jordan's chapter 9, in the PDP, vol.1), if, indeed, one wants to use the n-tuple of real numbers as the input representation, then it is very natural to assume (at least for a mathematician) that the input space is a vector space, with the resulting necessity of an essentially unique metric on it (if the metric is consistent with the underlying vector space structure, which is practically a universal assumption in mathematics, see [2] in my first posting). > In a sufficiently complex > network, you can pretty much get any arbitrary map you like from the input space to > the output, and the error measure is biased by the specific nature of the training > set (for example), and is measured on the output of the network AFTER it has gone > through what amounts to an arbitrary differentiable transformation. By this time, > the "metric" on the original input space can be all but destroyed. Add recurrency > and you even get rid of the fixed dimensionality of the input space. In the quote > above, it appears you are implying that there is some direct relationship between > the metric on the initial input space and the operation of the learning algorithm. > I do not see how this is the case. YES, INDEED, I AM STRONGLY SUGGESTING THAT THERE MUST BE A DIRECT CONNECTION "BETWEEN THE METRIC ON THE INITIAL INPUT SPACE AND THE OPERATIONS OF THE LEARNING ALGORITHM". IN OTHER WORDS, THE SET OF CURRENT OPERATIONS ON THE REPRESENTATION SPACE (WHICH, OF COURSE, CAN NOW BE DYNAMICALLY MODIFIED DURING LEARNING) SHOULD ALWAYS BE USED FOR DISTANCE COMPUTATION. What is the point of, first, changing the symbolic representation to the numeric representation, and, then, applying to this numeric representation "very strange", symbolic, operations? I absolutely fail to see the need for such an artificial contortion. > > The main reason we are developing the ETS model is precisely related to > > the fact that we believe it offers THE ONLY ONE POSSIBLE NATURAL (and > > fundamentally new) SYMBIOSIS of the discrete and the continuous FORMALISMS > > as opposed to the unnatural ones. I would definitely say (and you would > > probably agree) that (if, indeed, this is the case) it is the most > > important consideration. > > > > Moreover, it turns out that the concept of a fuzzy set, which was > > originally introduced in a rather artificial manner that didn't clarify > > the underlying source of fuzziness (and this have caused an understandable > > and substantial resistance to its introduction), emerges VERY naturally > > within the ETS model: the definition of the class via the corresponding > > distance function typically and naturally induces the fuzzy class boundary > > and also reveals the source of fuzziness, which includes the interplay > > between the corresponding weighted operations and (in the case of noise in > > the training set) a nonzero radius. Note that in the parity class problem, > > the parity class is not fuzzy, as reflected in the corresponding weighting > > scheme and the radius of 0. > > Well, what one mathematician calls natural and the other calls artificial may be > somewhat subject to taste as well as rational argument. At this point one can get > into the realm of mathematical aesthetics or philosophy rather than hard science. 
> From my point of view, symbolic representations can be seen as merely emergent > phenomena or patterns of behavior of physical feedback systems (i.e., looking at > cognition as essentially a bounded feedback system---bounded under normal > conditions, unless the system goes into seizure (explodes mathematically---well, it > is still bounded but it tries to explode!), of course.) From this point of view > both symbols and fuzziness and every other conceptual representation are neither > "true" nor "real" but simply patterns which tend to be, from an > information-theoretic point of view, compact and useful or efficient > representations. But they are built on a physical substrate of a feedback system, > not vice-versa. > > However, it isn't the symbol, fuzzy or not, which is ultimately general, it is the > feedback system, which is ultimately a physical system of course. So, while we may > be convinced that your formalism is very good, this does not mean it is more > fundamentally powerful than a simulation approach. It may be that your formalism is > in fact better for handling symbolic problems, or even problems which require a > mixture of fuzzy and discrete logic, etc., but what about problems which are not > symbolic at all? What about problems which are both symbolic and non-symbolic (not > just fuzzy, but simply not symbolic in any straightforward way?) > > The fact is, intuitively it seems to me that some connectionist approach is bound to > be more general than a more special-purpose approach. This does not necessarily > mean it will be as good or fast or easy to use as a specialized approach, such as > yours. But it is not at all convincing to me that just because the input space to a > connectionist network looks like R(n) in some superficial way, this would imply that > somehow a connectionist model would be incapable of doing symbolic processing, or > even using your model per se. The last paragraphs betray your classical physical bias based on our present (incidentally vector-space based) mathematics. As you can see from my home page, I do not believe in it any more: we believe that the (inductive) symbolic representation is a more basic and much more adequate (evolved during evolution) form of representation, while the numeric form is a very special case of the latter when the alphabet consists of a single letter. By the way, I'm not the only one to doubt the adequacy of the classical form of representation. For example, here are two quotes from Erwin Schrodinger's book "Science and Humanism" (Cambridge Univ. Press), a substantial part of which is devoted to a popular explication of the following ideas: "The observed facts (about particles and light and all sorts of radiation and their mutual interaction) appear to be REPUGNANT to the classical ideal of continuous description in space and time." "If you envisage the development of physics in THE LAST HALF-CENTURY, you get the impression that the discontinuous aspect of nature has been forced upon us VERY MUCH AGAINST OUR WILL. We seemed to feel quite happy with the continuum. Max Planck was seriously frightened by the idea of a discontinuous exchange of energy . . ." (italics are in the original) Cheers, Lev From mitsu at ministryofthought.com Sun Aug 16 22:03:32 1998 From: mitsu at ministryofthought.com (Mitsu Hadeishi) Date: Sun, 16 Aug 1998 19:03:32 -0700 Subject: Connectionist symbol processing: any progress?
References: Message-ID: <35D78F73.FC2C6E08@ministryofthought.com> Lev Goldfarb wrote: > Mitsu, I'm afraid, I failed to see what is wrong with my (quoted) > question. First, I suggested in it that to do inductive learning properly > ONE MUST HAVE AN EXPLICIT AND MEANINGFUL DISTANCE FUNCTION ON THE INPUT > SPACE. The point I am making is simply that after one has transformed the input space, two points which begin "close together" (not infinitesimally close, but just close) may end up far apart and vice versa. The mapping can be degenerate, singular, etc. Why is the metric on the initial space, then, so important, after all these transformations? Distance measured in the input space may have very little correlation with distance in the output space. Also, again, you continue to fail to address the fact that the input may be presented in time sequence (i.e., a series of n-tuples). What about that? In fact the structure of the whole thing may end up looking very much like your symbolic model. > > In a sufficiently complex > > network, you can pretty much get any arbitrary map you like from the input space to > > the output, and the error measure is biased by the specific nature of the training > > set (for example), and is measured on the output of the network AFTER it has gone > > through what amounts to an arbitrary differentiable transformation. By this time, > > the "metric" on the original input space can be all but destroyed. Add recurrency > > and you even get rid of the fixed dimensionality of the input space. In the quote > > above, it appears you are implying that there is some direct relationship between > > the metric on the initial input space and the operation of the learning algorithm. > > I do not see how this is the case. > > YES, INDEED, I AM STRONGLY SUGGESTING THAT THERE MUST BE A DIRECT > CONNECTION "BETWEEN THE METRIC ON THE INITIAL INPUT SPACE AND THE > OPERATIONS OF THE LEARNING ALGORITHM". IN OTHER WORDS, THE SET OF CURRENT > OPERATIONS ON THE REPRESENTATION SPACE (WHICH, OF COURSE, CAN NOW BE > DYNAMICALLY MODIFIED DURING LEARNING) SHOULD ALWAYS BE USED FOR DISTANCE > COMPUTATION. > > What is the point of, first, changing the symbolic representation to the > numeric representation, and, then, applying to this numeric representation > "very strange", symbolic, operations? I absolutely fail to see the need > for such an artificial contortion. If your problem is purely symbolic you may be right, but what if it isn't? (Also: no need to shout.) > > Well, what one mathematician calls natural and the other calls artificial may be > > somewhat subject to taste as well as rational argument. At this point one can get > > into the realm of mathematical aesthetics or philosophy rather than hard science. > > >From my point of view, symbolic representations can be seen as merely emergent > > phenomena or patterns of behavior of physical feedback systems (i.e., looking at > > cognition as essentially a bounded feedback system---bounded under normal > > conditions, unless the system goes into seizure (explodes mathematically---well, it > > is still bounded but it tries to explode!), of course.) From this point of view > > both symbols and fuzziness and every other conceptual representation are neither > > "true" nor "real" but simply patterns which tend to be, from an > > information-theoretic point of view, compact and useful or efficient > > representations. But they are built on a physical substrate of a feedback system, > > not vice-versa. 
> > > > However, it isn't the symbol, fuzzy or not, which is ultimately general, it is the > > feedback system, which is ultimately a physical system of course. So, while we may > > be convinced that your formalism is very good, this does not mean it is more > > fundamentally powerful than a simulation approach. It may be that your formalism is > > in fact better for handling symbolic problems, or even problems which require a > > mixture of fuzzy and discrete logic, etc., but what about problems which are not > > symbolic at all? What about problems which are both symbolic and non-symbolic (not > > just fuzzy, but simply not symbolic in any straightforward way?) > > > > The fact is, intuitively it seems to me that some connectionist approach is bound to > > be more general than a more special-purpose approach. This does not necessarily > > mean it will be as good or fast or easy to use as a specialized approach, such as > > yours. But it is not at all convincing to me that just because the input space to a > > connectionist network looks like R(n) in some superficial way, this would imply that > > somehow a connectionist model would be incapable of doing symbolic processing, or > > even using your model per se. > > The last paragraphs betray your classical physical bias based on our > present (incidentally vector-space based) mathematics. As you can see from > my home page, I do not believe in it any more: we believe that the > (inductive) symbolic representation is a more basic and much more adequate > (evolved during the evolution) form of representation, while the numeric > form is a very special case of the latter when the alphabet consists of a > single letter. It is quite often possible to describe one representation in terms of another; symbolic in terms of numbers, and vice-versa. What does this prove? You can say numbers are an alphabet with only one letter; I can describe alphabets with numbers, too. The real question is, which representation is natural for any given problem. Obviously symbolic representations have value and are parsimonious for certain problem domains, or they wouldn't have evolved in nature. But to say your discovery, great as it might be, is the only "natural" representation seems rather strange. Clearly, mechanics can be described rather elegantly using numbers, and there are lots of beautiful symmetries and so forth using that description. I am willing to believe other descriptions may be better for other situations, but I do not believe that it is reasonable to say that one can be certain that any given representation is *clearly* more natural than another. It depends on the situation. Symbolic representations have evolved, but so have numeric representations. They have different applications, and you can transform between them. Is one fundamentally "better" than another? Maybe better for this or that problem, but I do not believe it is reasonable to say they are better in some absolute sense. I am a "representation agnostic." I certainly am not going to say that numeric representations are the "only" valid basis, or even that they are foundational (to me that would be incoherent). All representations I believe are kind of stable information points reached as a result of dynamic feedback; in other words, they survive because they have evolutionary value. Whether you call this or that representation "real" or "better" to me is a matter of application and parsimony. 
The ultimate test is seeing how simple a description of a model is in any given representation. If the description is complex and long, the representation is not efficient; if it is short, it is. However, for generality one might choose a less parsimonious representation so you can gain expressive power over a greater range of models. Whether your model is better than connectionist models I do not know, but I do not think it is necessary to think of it as some kind of absolute choice. May the best representation win, as it were (it is a matter of survival of the fittest representation.)

Mitsu

> By the way, I'm not the only one to doubt the adequacy of the classical
> form of representation. For example, here are two quotes from Erwin
> Schrodinger's book "Science and Humanism" (Cambridge Univ. Press), a
> substantial part of which is devoted to a popular explication of the
> following ideas:
>
> "The observed facts (about particles and light and all sorts of radiation
> and their mutual interaction) appear to be REPUGNANT to the classical
> ideal of continuous description in space and time."
>
> "If you envisage the development of physics in THE LAST HALF-CENTURY, you
> get the impression that the discontinuous aspect of nature has been forced
> upon us VERY MUCH AGAINST OUR WILL. We seemed to feel quite happy with the
> continuum. Max Planck was seriously frightened by the idea of a
> discontinuous exchange of energy . . ."
>
> (italics are in the original)
>
> Cheers,
> Lev

From bryan at cog-tech.com Sun Aug 16 22:04:25 1998 From: bryan at cog-tech.com (Bryan B. Thompson) Date: Sun, 16 Aug 1998 22:04:25 -0400 Subject: Connectionist symbol processing: any progress? Message-ID: <199808170204.WAA32648@cti2.cog-tech.com>

Tony Plate's response is interesting and I, for one, will have to give it some thought. I am not certain that

> concepts, respectively. For example, one can build a distributed
> representation for a shape configuration#33 of "circle above triangle" as:
> config33 = vertical + circle + triangle + ontop*circle + below*triangle
>
> By using an appropriate multiplication operation (I used circular, or
> wrapped, convolution), the reduced representation of the compositional
> concept (e.g., config33) has the same dimension as its components, and can
> readily be used as a component in other higher-level relations. Quite

is inherently different from a spatial approach and, hence, a localist approach itself. You need to have enough dimensionality to represent the key features as well as enough to multiply them out by the key relational features -- quite a few dimensions, even if some of that dimensionality is pushed off into numerical precision. It sounds suspiciously like a localist (i.e., locally spatial) encoding. Frankly, I imagine that even a temporal encoding must be localist if it is to show "symbolic processing" behavior. That is, the temporal encoding must be striated with patterned regions that are, themselves, interpretable elements -- compositionality in time vs. space.

If I am willing to call both temporal encoding and spatial encoding schemes localist, then what would I consider "distributed?" To the extent that this is a meaningful distinction, I would have to say that "distributed" refers to the equi-presence of the encoding of an entity or compositional relation among all elements of the representation, e.g., equally present in all internal variables in a recurrent network.
This is perhaps the intent of people who point to "distributed" representations and say that they can only encode a single entity at a time. When such systems are forced to encode compositional representations, they are also forced to develop decidedly non-equal distributions of the information across the elements of the representation. That is, they *must* become localist in time or in space to encode things compositionally. If this line of conjecture is correct, then localist and distributed are simply the ordinate directions on an axis of representation that reflects the compositionality of information, and spatial / temporal are the ordinate directions of an orthogonal axis reflecting how information is encoded within a fixed set of resources. Clearly this sense of distributed vs. localist is directly tied to the connectivity of the network and the degree to which weights are global vs. local. Another "upside" of localism, however achieved, is that it results in structured credit assignment -- weight or dynamics changes exert only a localized influence on the network behavior and do not disturb unrelated dynamics.

My challenge for spatial encoding schemes is that they seem profoundly challenged by metaphor. For example, "Life is like a garden." This saying, when considered, immediately enacts a deep correspondence, an *invariance*, between two different *sets* of systematic relations (each defined over a different set of entities). If relations are spatially encoded, then it is beyond me how such systematic correspondences can be enacted by the dynamic activation of a single new relation. As I consider the ways in which I relate to a garden, the metaphor expands for me the systematically parallel ways in which I may relate to life as well. For example, you sow seeds, tend them, and harvest nourishing rewards. The seeds become metaphorical, e.g., as new beginnings, and the parallel yields an interpretation in "life". (Other inferences can be systematically drawn -- it takes a lot of "fertilizer" to grow anything :} and sometimes I can't tell which is the weed and which is the seedling.)

If we allocate spatial encoding to systematic relations, then how can we apply those systematic relations to new semantics -- both "instantly" and without loss of the original interpretations? In fact, our understanding typically grows for both domains illuminated by the metaphor. For me, a temporal (vs. spatial) encoding does not help. I would expect a temporal encoding to have developed a topology, upon whose relative stasis the system is equally dependent to draw out meanings. It seems, to me, that another level of indirection may be required to map onto one another such previously distinct systematic relations. On the other hand, perhaps such inferences "by metaphor" are not as automatic as I might believe. In that case it becomes more plausible to see these as a metalevel in which systematic correspondences are established between bindings in the two realms of metaphor. Then, within those binding-legitimizing invariances, systematic relations from one domain may readily apply to the other and our directed attention, or wandering gaze, is used to draw out new inferences from within one domain or the other.

--bryan thompson

PS: I will be on vacation next week (Aug 17-21) and will be unable to answer any replies until I return.

From bryan at cog-tech.com Sun Aug 16 22:27:46 1998 From: bryan at cog-tech.com (Bryan B.
Thompson) Date: Sun, 16 Aug 1998 22:27:46 -0400 Subject: Structured connectionist architectures In-Reply-To: <199808160830.UAA08508@rialto.mcs.vuw.ac.nz> (message from Tony Plate on Sun, 16 Aug 1998 20:30:59 +1200) Message-ID: <199808170227.WAA32737@cti2.cog-tech.com>

It seems that there are quite a few people working on structured connectionist approaches to symbolic reasoning. I would like to know if anyone has put together an (annotated?) bibliography on such research.

On 16Aug98, Tony Plate wrote (was Re: Connectionist symbol processing: any progress?):

> One very interesting model of analogical processing that was
> presented at the workshop in Bulgaria (in July) was John Hummel and
> Keith Holyoak's LISA model (ref at end). This model uses
> distributed representations for roles and fillers, binding them
> together with temporal synchrony, and achieves quite impressive
> results (John, in case you're listening, this is not to say that I
> think temporal binding is the right way to go, but it's an
> impressive model and presents a good challenge to other
> approaches.)

If none exists, I would be more than willing to compile one myself if people will contribute entries / pointers to their own work.

--bryan thompson

PS: I will be on vacation next week (Aug 17-21) and will be unable to answer any replies until I return.

From Tony.Plate at MCS.VUW.AC.NZ Mon Aug 17 06:59:45 1998 From: Tony.Plate at MCS.VUW.AC.NZ (Tony Plate) Date: Mon, 17 Aug 1998 22:59:45 +1200 Subject: Connectionist symbol processing: any progress? In-Reply-To: Your message of "Sun, 16 Aug 1998 22:04:25 -0400." <199808170204.WAA32648@cti2.cog-tech.com> Message-ID: <199808171059.WAA31817@rialto.mcs.vuw.ac.nz>

Bryan B. Thompson writes:

> ... I am not certain that
>
> [snip description of my scheme ...]
>
> is inherently different from a spatial approach and, hence, a localist
> approach itself. You need to have enough dimensionality to represent
> the key features as well as enough to multiply them out by the key
> relational features -- quite a few dimensions [snip ...]
>
> If I am willing to call both temporal encoding and spatial encoding
> schemes localist, then what would I consider "distributed?" To the
> extent that this is a meaningful distinction, I would have to say
> that "distributed" refers to the equi-presence of the encoding of an
> entity or compositional relation among all elements of the
> representation, e.g., equally present in all internal variables in a
> recurrent network. [snip ...]

Actually, Holographic Reduced Representations (HRRs) are an "equi-present" code -- everything represented is represented over all of the units. Suppose you have a vector X which represents some structure. Then you can take just the first half of X, and it will also represent that structure, though it will be noisier. This same property is shared by Kanerva's binary spatter-code and may be shared by some of the codes Ross Gayler has been developing. The dimensionality required is high -- for HRRs it's in the hundreds to thousands of elements. But, HRRs have an interesting scaling property -- toy problems involving just a couple dozen relations might require a dimensionality of 1000, but the dimensionality doesn't need to increase much (to 2 or 4 thousand) to handle problems involving tens of thousands of relations.

Yes, I agree fully that metaphor and analogy are intriguing examples of structural processing, and I believe it could be very fruitful to investigate connectionist processing for them.

Tony Plate
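As a concrete illustration of the binding and superposition operations Plate describes above, here is a minimal numpy sketch. It is not Plate's code; the dimensionality, the item names, and the use of the involution as an approximate inverse are illustrative choices, following the general HRR recipe of binding with circular convolution and decoding by convolving with an approximate inverse.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1024                                   # HRR dimensionality (illustrative)

    def rand_vec():
        # random item vector with expected Euclidean norm of about 1
        return rng.normal(0.0, 1.0 / np.sqrt(n), n)

    def cconv(a, b):
        # circular (wrapped) convolution via FFT
        return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

    def involution(a):
        # approximate inverse used for decoding
        return np.concatenate(([a[0]], a[:0:-1]))

    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    vertical, circle, triangle, ontop, below = (rand_vec() for _ in range(5))

    # "circle above triangle", roughly as in the quoted config33
    config33 = (vertical + circle + triangle
                + cconv(ontop, circle) + cconv(below, triangle))

    # decode "what is on top?" by convolving with the approximate inverse of the role
    probe = cconv(config33, involution(ontop))
    for name, v in [("circle", circle), ("triangle", triangle), ("vertical", vertical)]:
        print(name, round(cosine(probe, v), 2))   # circle should score highest

    # equi-presence: even the first half of the trace, compared against the first
    # halves of the items, still points to its constituents, only more noisily
    print(round(cosine(config33[: n // 2], circle[: n // 2]), 2),
          round(cosine(config33, circle), 2))

With a few hundred to a few thousand dimensions, the decoded probe can be cleaned up against an item memory by taking the most similar stored vector, which is the usual way such noisy decodings are read out.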
From FRYRL at f1groups.fsd.jhuapl.edu Mon Aug 17 09:45:07 1998 From: FRYRL at f1groups.fsd.jhuapl.edu (Fry, Robert L.) Date: Mon, 17 Aug 1998 09:45:07 -0400 Subject: FW: Connectionist symbol processing: any progress? Message-ID:

On Sat, 15 Aug 1998, Mitsu Hadeishi wrote:
> > Lev,
> > Okay, so we agree on the following:
> > Recurrent ANNs have the computational power required. The only
> thing at issue is the learning algorithm.
>
> Lev Goldfarb responded:
> "The ONLY thing at issue" IS the MAIN thing at issue, because the
> simulation of the Turing machine is just a clever game, while an adequate
> model of inductive learning should, among many other things, change our
> understanding of what science is (see, for example, Alexander Bird,
> Philosophy of Science, McGill-Queen's University Press, 1998).

I have not seen this reference, but will certainly seek it. I certainly agree with it and especially so in the context of connectionist symbolic processing.

Consider a computational paradigm where a single- or multiple-neuron layer is viewed as an information channel. It is different, however, from classical Shannon channels in that the neuron transduces information (vs. transmission) from input to an internal representation which is in turn used to select an output code. In a conventional Shannon channel, a channel input code is selected and then inserted into a channel which will degrade this information relative to a receiver that seeks to observe it. That is, one can distinguish between a communications system that effects the transmission of information and a physical system that effects the transduction of information. The engineering objective (as stated by Shannon) was to maximize the entropy of the source and match this to the channel capacity. Alternatively, consider a neural computational paradigm where the computational objective is to maximize the information transduced and match this to the output entropy of the neuron. That is, transduction and transmission are complementary processes of information transfer. Information transfer from physical system to physical system requires both.

It is interesting that Warren Weaver, who co-authored the classic 1949 book "The Mathematical Theory of Communication", recognized this distinction and even made the following statement: "The word communication will be used here in a very broad sense to include all procedures by which one mind may affect another." This is a very interesting choice of words.

Why is such a perspective important? Does it provide an unambiguous way of defining learning, symbols, input space, output space, computational objective/metrics, and an inductive theory of neural computation? The neural net community is often at odds with itself regarding having common bases of definitions and interpretations for these terms. After all, regardless of the learning objective function or error criterion, biological and artificial neurons, through learning, must either modify what they measure, e.g., synaptic efficacies and possibly intradendritic delays, modify what signals they generate for a given input, e.g., variations in firing threshold, or a combination of these.

From sgallant at kstream.com Mon Aug 17 12:29:38 1998 From: sgallant at kstream.com (Steve Gallant) Date: Mon, 17 Aug 1998 11:29:38 -0500 Subject: Connectionist symbol processing: any progress?
In-Reply-To: <199808160830.UAA08508@rialto.mcs.vuw.ac.nz> References: Message-ID: <3.0.5.32.19980817112938.008abca0@pluto.kstream.com> In addition to the work on Latent Semantic Indexing mentioned by Tony Plate, there is a body of work involving 'Context Vectors'. This approach was specifically motivated by an attempt to capture semantic information in fixed-length vectors, and is based upon work I did at Northeastern U. A good overview of LSI and Context vectors can be found in: Caid WR, Dumais ST and Gallant SI. Learned vector-space models for document retrieval. Information Processing and Management, Vol. 31, No. 3, pp. 419-429, 1995. and the original source was: Gallant, S. I. A Practical Approach for Representing Context And for Performing Word Sense Disambiguation Using Neural Networks. Neural Computation, Vol. 3, No. 3, 1991, 293-309. Over the last several years HNC Software has further developed and commercialized this approach, forming a division called Aptex. Steve Gallant At 08:30 PM 8/16/98 +1200, Tony Plate wrote: > >Work has been progressing on higher-level connectionist >processing, but progress has not been blindingly fast. As >others have noted, it is a difficult area. > >One of things that has recently renewed my interest in the >idea of using distributed representations for processing >complex information was finding out about Latent Semantic >Analysis/Indexing (LSA/LSI) at NIPS*97. LSA is a method >for taking a large corpus of text and constructing vector .. Steve Gallant Knowledge Stream Partners 148 State Street Boston, MA 02109 tel: 617/742-2500, x562 fax: 617/742-5820 email: sgallant at kstream.com From cdr at lobimo.rockefeller.edu Mon Aug 17 14:33:32 1998 From: cdr at lobimo.rockefeller.edu (George Reeke) Date: Mon, 17 Aug 1998 14:33:32 -0400 Subject: Connectionist symbol processing: any progress? In-Reply-To: Mitsu Hadeishi "Re: Connectionist symbol processing: any progress?" (Aug 16, 7:03pm) References: <35D78F73.FC2C6E08@ministryofthought.com> Message-ID: <980817143332.ZM11217@grane.rockefeller.edu> On Aug 16, 7:03pm, Mitsu Hadeishi wrote: > The point I am making is simply that after one has transformed the > input space, two points which begin "close together" (not > infinitesimally close, but just close) may end up far apart and vice > versa. The mapping can be degenerate, singular, etc. Why is the > metric on the initial space, then, so important, after all these > transformations? Distance measured in the input space may have very > little correlation with distance in the output space. I can't help stepping in with the following observation: The reason that distance in the input space is so important is that the input space is the real world. It is generally (not always, of course) useful for biological organisms to make similar responses to similar situations--this is what we call "generalization". For this reason, whatever kind of representation is used, it probably should not distort the real-world metric too much. It is perhaps too easy when thinking in terms of mathematical abstractions to forget what the purpose of all these transformations might be. Regards, George Reeke Laboratory of Biological Modelling The Rockefeller University 1230 York Avenue New York, NY 10021 phone: (212)-327-7627 email: reeke at lobimo.rockefeller.edu From sirosh at hnc.com Mon Aug 17 14:45:30 1998 From: sirosh at hnc.com (Sirosh, Joseph) Date: Mon, 17 Aug 1998 11:45:30 -0700 Subject: What have neural networks achieved? 
Message-ID: Michael, A significant commercial success of neural networks has been in credit card fraud detection. The Falcon credit card fraud detection package, developed by HNC Software Inc. of San Diego (http://www.hnc.com/), uses supervised neural networks, covers over 260 million credit cards worldwide, and generates several tens of millions in annual revenue. Attached is a corporate blurb that gives more info about HNC and some of its products. There's more info on the company web page. Sincerely, Joseph Sirosh Senior Staff Scientist Exploratory R&D Group HNC Software Inc. ================================= Headquartered in San Diego, California, HNC Software Inc. (NASDAQ: HNCS) is the leading vendor of computational intelligence software solutions for the financial, insurance, and retail markets, and U.S. Government customers. HNC Software and its subsidiaries - Risk Data Corporation, CompReview, Aptex, and Retek - use advanced technologies such as neural networks, context vector analysis, and expert rules to deliver powerful solutions for complex pattern recognition and predictive modeling problems. For the U.S. Government, HNC has developed systems for content based text retrieval, multimedia information retrieval, image understanding, and intelligent agents. For commercial markets, HNC is the leading supplier of credit card fraud detection systems, with 23 of the 25 largest U.S. financial institutions being HNC customers. HNC also develops a broad spectrum of additional products, including solutions for profitability analysis, bankruptcy prediction, worker's compensation claims management, retail information management, and database mining. Since its founding in 1986, HNC has grown along with its product offerings and, as of the end of fiscal year 1997, had over 700 employees and revenues of $113 million. ================================== > ---------- > From: Michael A. Arbib > Sent: Monday, August 17, 1998 11:28 AM > To: Sirosh, Joseph > Subject: What have neural networks achieved? > > > Recently, Stuart Russell addressed the following query to Fellows of the > AAAI: > > > > > This Saturday there will be a debate with John McCarthy, David Israel, > > > Stuart Dreyfus and myself on the topic of > > > "How is the quest for artificial intelligence progressing?" > > > This is widely publicized, likely to be partially televised, > > > and will be attended by a lot of journalists. > > > > > > For this, and for AAAI's future reference, I'd like to collect > > > convincing examples of progress, particularly examples that will > > > convince journalists and the general public. For now all I need > > > is a URL or other accessible pointer and a one or two sentence > > > description. (It does not *necessarily* have to be your own work!) > > > Pictures would be very helpful. > > > > This spurs me as I work on the 2nd edition of the Handbook of Brain > Theory > > and Neural Networks (due out in 2 years or so; MIT Press has just issued > a > > paperback of the first edition) to pose to you two related questions: > > > > a) What are the "big success stories" (i.e., of the kind the general > public > > could understand) for neural networks contributing to the understanding > of > > "real" brains, i.e., within the fields of cognitive science and > > neuroscience. 
> > > > b) What are the "big success stories" (i.e., of the kind the general > public > > could understand) for neural networks contributing to the construction > of > > "artificial" brains, i.e., successfully fielded applications of NN > hardware > > and software that have had a major commercial or other impact? > > > > > > > > ********************************* > > Michael A. Arbib > > USC Brain Project > > University of Southern California > > Los Angeles, CA 90089-2520, USA > > arbib at pollux.usc.edu > > (213) 740-9220; Fax: 213-740-5687 > > http://www-hbp.usc.edu/HBP/ > > > > > > > -- > From goldfarb at unb.ca Mon Aug 17 17:09:01 1998 From: goldfarb at unb.ca (Lev Goldfarb) Date: Mon, 17 Aug 1998 18:09:01 -0300 (ADT) Subject: Connectionist symbol processing: any progress? In-Reply-To: <199808160830.UAA08508@rialto.mcs.vuw.ac.nz> Message-ID: On Sun, 16 Aug 1998, Tony Plate wrote: > One of things that has recently renewed my interest in the > idea of using distributed representations for processing > complex information was finding out about Latent Semantic > Analysis/Indexing (LSA/LSI) at NIPS*97. LSA is a method > for taking a large corpus of text and constructing vector > representations for words in such a way that similar words > are represented by similar vectors. LSA works by > representing a word by its context (harkenning back to a > comment I recently saw attributed to Firth 1957: "You shall > know a word by the company it keeps" :-), and then reducing > the dimensionality of the context using singular value > decomposition (SVD) (v. closely related to principal > component analysis (PCA)). The vectors constructed by LSA > can be of any size, but it seems that moderately high > dimensions work best: 100 to 300 elements. In connection with the above, in my Ph.D. (published as "A new approach to pattern recognition", in Progress in Pattern recognition 2, ed. Kanal and Rosenfeld, North-Holland, 1985, pp. 241-402) I have proposed to replace the PATTERN RECOGNITION PROBLEM formulated in an input space with a distance measure defined on it by the corresponding problem in a UNIQUELY constructed pseudo-Euclidean vector space (through a uniquely constructed isometric, i.e. distance preserving, embedding of the training set, using, of course SVD of the distance matrix). The classical recognition techniques can then be generalized to the pseudo-Euclidean space and the PATTERN RECOGNITION PROBLEM can then be solved more efficiently than in a general distance space setting. The model is OK, IF YOU HAVE THE RIGHT DISTANCE MEASURE, i.e. if you have the distance measure that capture the CLASS representation and therefore provides a good separation of the class from its complement. However, in general, WHO WILL GIVE YOU THE "RIGHT" DISTANCE MEASURE? I now believe that the construction of the "right" distance measure is a more basic, INDUCTIVE LEARNING, PROBLEM. In a classical vector space setting, this problem is obscured because of the rigidity of the representation space (and, as I have mentioned earlier, because of the resulting uniqueness of the metric), which apparently has not raised any substantiated suspicions in non-cognitive sciences. I strongly believe that this is due to the fact that the classical measurement processes are based on the concept of number and therefore as long as we rely on such measurement processes we are back where we started from--vector space representation. 
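For readers who want to see the flavor of the construction Goldfarb refers to, the following is a small numpy sketch of an isometric embedding of a training set into a pseudo-Euclidean space directly from its distance matrix, via the standard double-centering and eigendecomposition route. Whether this matches the exact construction in the 1985 paper is something the paper itself should be consulted for, and the toy distance matrix below is made up for the illustration.

    import numpy as np

    def pseudo_euclidean_embedding(D):
        # isometric embedding of points with pairwise distances D into a
        # pseudo-Euclidean space: eigendecompose the double-centered matrix
        # of squared distances and record negative eigenvalues in the signature
        n = D.shape[0]
        J = np.eye(n) - np.ones((n, n)) / n
        B = -0.5 * J @ (D ** 2) @ J
        w, V = np.linalg.eigh(B)
        keep = np.abs(w) > 1e-9
        X = V[:, keep] * np.sqrt(np.abs(w[keep]))   # coordinates
        signature = np.sign(w[keep])                # +1 / -1 axes
        return X, signature

    # a toy distance matrix that cannot be embedded in any Euclidean space
    D = np.array([[0., 2., 2., 1.],
                  [2., 0., 2., 1.],
                  [2., 2., 0., 1.],
                  [1., 1., 1., 0.]])
    X, sig = pseudo_euclidean_embedding(D)
    i, j = 0, 3
    d2 = np.sum(sig * (X[i] - X[j]) ** 2)           # pseudo-Euclidean squared distance
    print(sig, round(float(d2), 6), D[i, j] ** 2)   # d2 reproduces D[i, j]**2

The negative entry in the signature is exactly what distinguishes the pseudo-Euclidean case from ordinary metric multidimensional scaling; the embedding still reproduces the given distances, but some axes contribute with a minus sign.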
On Sun, 16 Aug 1998, Mitsu Hadeishi wrote: > It is quite often possible to describe one representation in terms of another; symbolic in > terms of numbers, and vice-versa. What does this prove? You can say numbers are an > alphabet with only one letter; I can describe alphabets with numbers, too. > > The real question is, which representation is natural for any given problem. Obviously > symbolic representations have value and are parsimonious for certain problem domains, or > they wouldn't have evolved in nature. But to say your discovery, great as it might be, is > the only "natural" representation seems rather strange. Clearly, mechanics can be > described rather elegantly using numbers, and there are lots of beautiful symmetries and > so forth using that description. I am willing to believe other descriptions may be better > for other situations, but I do not believe that it is reasonable to say that one can be > certain that any given representation is *clearly* more natural than another. It depends > on the situation. Symbolic representations have evolved, but so have numeric > representations. They have different applications, and you can transform between them. > Is one fundamentally "better" than another? Maybe better for this or that problem, but I > do not believe it is reasonable to say they are better in some absolute sense. > > I am a "representation agnostic." (Mitsu, my apologies for the paragraph in italics in my last message: I didn't intend to "shout".) Concluding my brief discussion of the "connectionist symbol processing", I would like to say that I'm not at all a "representation agnostic". Moreover, I believe that the above "agnostic" position is a defensive mechanism that the mind has developed in the face of the mess that has been created out of the representation issues during the last 40 years. During this time, with the full emergence of computers, on the one hand, the role of non-numeric representations has begun to increase (see, for example, "Forms of Representation", ed. Donald Peterson, Intellect Books, 1996) and, at the same time, partly due to the disproportionate and inappropriate influence of the computability theory (again, related to the former), the concept of representation became relativized, as Mitsu so succinctly and quite representatively articulated above and throughout the entire discussion. Computability theory (and, ironically, the entire logic) has not dealt with the representational issues, because, basically, it has ignored the nature of intelligent computational processes, and thus, for example, the central, I believe, issue of how to construct the inductive class representation has not been addressed within it. My purpose for participating in this brief discussion (spread over the several messages) has been to strongly urge both theoretical and applied cognitive scientists to take the representation issue much more seriously and treat it with all the respect one can muster, i.e. to assume that the input, or representation, space is all we have and all we will ever have, and, as the mathematical (not logical) tradition of the past several thousand years strongly suggests, the operations of the representation space "make" this space. All other operations not related to the original space operations become then essentially invisible. 
For us, this path leads (unfortunately, very slowly) to a considerably more "non-numeric" mathematics that has been historically the case so far, and, at the same time, it inevitably leads to the "symbolic", or inductive, measurement processes, in which the outcome of the measurement process is not a number but a structured entity which we call "struct". Such measurement processes appear to be far-reaching generalizations of the classical, or numeric, measurement processes. Best regards and cheers, Lev From max at currawong.bhs.mq.edu.au Mon Aug 17 17:42:16 1998 From: max at currawong.bhs.mq.edu.au (Max Coltheart) Date: Tue, 18 Aug 1998 07:42:16 +1000 (EST) Subject: connectionist symbol processing Message-ID: <199808172142.HAA20682@currawong.bhs.mq.edu.au> A non-text attachment was scrubbed... Name: not available Type: text Size: 1442 bytes Desc: not available Url : https://mailman.srv.cs.cmu.edu/mailman/private/connectionists/attachments/00000000/ba92a509/attachment.ksh From zhuh at santafe.edu Mon Aug 17 19:33:13 1998 From: zhuh at santafe.edu (Huaiyu Zhu) Date: Mon, 17 Aug 1998 17:33:13 -0600 (MDT) Subject: Evaluating learning algorithms (Was Re: Connectionist symbol processing: any progress?) In-Reply-To: <35D65650.3B387A0D@ministryofthought.com> Message-ID: Dear Connectionists, First, I apologize for jumping into an already lengthy debate between Lev Goldfarb and Mitsu Hadeishi. I will try to be succinct. I hope you will find the following worth reading. To begin with, let me claim that it is not contradictory to say: 1. The most important issue IS performance of learning rules. 2. Some quantitative measurement (loosely called "metric") is needed. 3. There is no intrinsic Euclidean metric on the space of learning algorithms. 4. The geometry and topology of input space is generally irrelevant. Now let me explain why they must be true, and what kind of theory can be developed to satisfy all of them: The key observation is that learning algorithms act on the space of probability distributions. Unless we are doing rote learning, we cannot assume the data are generated by a deterministic mechanism. Therefore the proper measurement should be derived from some geometry of the space of probability distributions. Now it is clear that even for a discrete set X, the space of functions on X is still a linear space equipped with norms and so on, and the space of probability distributions on X is still a differentiable manifold with intrinsic geometric structures. In fact the function space case can always be regarded as a special case of probability space. The space of probability distributions is not a Euclidean space. However, the beautiful theory of Information Geometry developed by Amari and others shows that it behaves almost as if it has a squared distance. For example, there is a Pythagorean Theorem for cross-entropy that enables us to do things very similar to taking averages and projecting on a linear subspace in an Euclidean space. Information Geometry fully specifies the amount of information that is lost by various computations. (A learning algorithm tries to reduce data with minimum reduction of information.) However, our information is not solely contained in the data. Different conclusions can be drawn from the same data if different statistical assumptions are made. Such assumptions can be specified by a prior (a distribution of all the possible distributions). Technically this is called the complete class theorem. 
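As a small numerical illustration of the Pythagorean relation for cross-entropy mentioned above, here is a sketch of the I-projection of a reference distribution onto a family with one fixed marginal; the two-bit example and the particular distributions are made up for the illustration and are not taken from Zhu's paper.

    import numpy as np

    def kl(p, q):
        return float(np.sum(p * np.log(p / q)))

    rng = np.random.default_rng(1)

    # distributions over the four outcomes of two binary variables (x1, x2),
    # ordered (0,0), (0,1), (1,0), (1,1)
    r = rng.dirichlet(np.ones(4))       # reference distribution
    p = rng.dirichlet(np.ones(4))       # an arbitrary "data" distribution
    a = p[2] + p[3]                     # its marginal P(x1 = 1)

    # I-projection q of r onto the linear family {s : s(x1 = 1) = a}:
    # rescale the x1 = 0 and x1 = 1 blocks of r to satisfy the constraint
    r1 = r[2] + r[3]
    scale = np.array([(1 - a) / (1 - r1), (1 - a) / (1 - r1), a / r1, a / r1])
    q = r * scale

    # Pythagorean identity: KL(p||r) = KL(p||q) + KL(q||r), exactly
    print(round(kl(p, r), 6), round(kl(p, q) + kl(q, r), 6))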
The prior and the data should be consistently combined in a Bayesian method. It can be shown that learning algorithms can be completely quantitatively evaluated in this framework. The input space is generally irrelevant because usually we only want to learn the conditional distribution of output for given input. In the case of unsupervised learning, such as independent component analysis, it is still the space of distributions that is relevant. Again, information geometry has helped to make several breakthroughs in recent years. (Check for papers by Amari, Cardoso, Sejnowski and many others. See, e.g., http://www.cnl.salk.edu/~tewon/ica_cnl.html.)

The following short paper (4 pages) contains an outline with a little bit more technical detail. The numerous articles by Prof. Amari, and especially his 1985 book, should prove extremely useful regarding information geometry.

H. Zhu, Bayesian geometric theory of learning algorithms. Proc. of Intnl. Conf. Neural Networks (ICNN'97), Vol.2, pp.1041-1044. Houston, 9-12 June, 1997. ftp://ftp.santafe.edu/pub/zhuh/ig-learn-icnn.ps

Hope you enjoyed reading to this point.

Huaiyu

-- Huaiyu Zhu Tel: 1 505 984 8800 ext 305 Santa Fe Institute Fax: 1 505 982 0565 1399 Hyde Park Road mailto:zhuh at santafe.edu Santa Fe, NM 87501 http://www.santafe.edu/~zhuh/ USA ftp://ftp.santafe.edu/pub/zhuh/

From Sougne at forum.fapse.ulg.ac.be Tue Aug 18 04:40:13 1998 From: Sougne at forum.fapse.ulg.ac.be (Jacques Sougne) Date: Tue, 18 Aug 1998 10:40:13 +0200 Subject: Connectionist symbol processing: any progress? In-Reply-To: <23040.902820867@skinner.boltz.cs.cmu.edu> Message-ID:

Hi, I am working on modeling deductive reasoning using a distributed network of spiking nodes. Variable binding is achieved by temporal synchrony; while this is not a new technique, the use of distributed representations and the way the model solves the problem of multiple instantiation are new. I call the model INFERNET.

I obtained good results with conditional reasoning, even with negated conditionals, in all these forms:

A=>B, A; A=>B, ~A; A=>B, B; A=>B, ~B
A=>~B, A; A=>~B, ~A; A=>~B, ~B; A=>~B, B
~A=>B, ~A; ~A=>B, A; ~A=>B, B; ~A=>B, ~B
~A=>~B, ~A; ~A=>~B, A; ~A=>~B, ~B; ~A=>~B, B

The INFERNET performance fits human data, which are sensitive to negation. The effect of negation is often referred to as negative conclusion bias. I also worked on problems requiring multiple instantiations (see Sougne, 1998a; Sougne, 1998b). In INFERNET multiple instantiation is achieved by using the neurobiological phenomenon of period doubling. Nodes pertaining to a doubly instantiated concept will sustain two oscillations. This means that these nodes will be able to synchronize with two different sets of nodes. The INFERNET performance seems to fit human data for problems requiring multiple instantiation like: Mark loves Helen and Helen loves John. Who is jealous of whom?

Due to distributed representation I also found an effect of similarity of the concepts used in deductive tasks, which is confirmed by empirical evidence. I also found an interesting effect of noise. When white noise is added to the system (and if it is not too strong) the performance of the system is improved. This phenomenon is known as stochastic resonance (see Levin & Miller 1996, Sougne, 1998b).

Description of my work can be found in:

Sougne, J. (1996). A Connectionist Model of Reflective Reasoning Using Temporal Properties of Node Firing. Proceedings of the Eighteenth Annual Conference of the Cognitive Science Society. (pp.
666-671) Mahwah, NJ: Lawrence Erlbaum Associates.

Sougne, J. (1998). Connectionism and the problem of multiple instantiation. Trends in Cognitive Sciences, 2, 183-189.

Sougne, J. (1998). Period Doubling as a Means of Representing Multiply Instantiated Entities. Proceedings of the Twentieth Annual Conference of the Cognitive Science Society. (pp. 1007-1012) Mahwah, NJ: Lawrence Erlbaum Associates.

Sougne, J. and French, R. M. (1997). A Neurobiologically Inspired Model of Working Memory Based on Neuronal Synchrony and Rhythmicity. In J. A. Bullinaria, D. W. Glasspool, and G. Houghton (Eds.) Proceedings of the Fourth Neural Computation and Psychology Workshop: Connectionist Representations. London: Springer-Verlag.

Some preliminary versions are available at: http://www.fapse.ulg.ac.be/Lab/Trav/jsougne.html

Some of the recently collected data are not yet published; if you are interested, you can contact me. j.sougne at ulg.ac.be

References

Sougne, J. (1998a). Connectionism and the problem of multiple instantiation. Trends in Cognitive Sciences, 2, 183-189.

Sougne, J. (1998b). Period Doubling as a Means of Representing Multiply Instantiated Entities. Proceedings of the Twentieth Annual Conference of the Cognitive Science Society. (pp. 1007-1012) Mahwah, NJ: Lawrence Erlbaum Associates.

Levin, J. E. and Miller, J. P. (1996). Broadband neural encoding in the cricket cercal sensory system enhanced by stochastic resonance. Nature, 380, 165-168.

Jacques

From Rob.Callan at solent.ac.uk Tue Aug 18 07:08:48 1998 From: Rob.Callan at solent.ac.uk (Rob.Callan@solent.ac.uk) Date: Tue, 18 Aug 1998 12:08:48 +0100 Subject: Connectionist symbol processing: any progress? Message-ID: <80256664.003DC810.00@hercules.solent.ac.uk>

Dear Bryan

I was interested to read your response to Tony Plate's message:

"Tony Plate's response is interesting and I, for one, will have to give it some thought. I am not certain that

> concepts, respectively. For example, one can build a distributed
> representation for a shape configuration#33 of "circle above triangle" as:
> config33 = vertical + circle + triangle + ontop*circle + below*triangle
>
> By using an appropriate multiplication operation (I used circular, or
> wrapped, convolution), the reduced representation of the compositional
> concept (e.g., config33) has the same dimension as its components, and can
> readily be used as a component in other higher-level relations. Quite

is inherently different from a spatial approach and, hence, a localist approach itself. You need to have enough dimensionality to represent the key features as well as enough to multiply them out by the key relational features -- quite a few dimensions, even if some of that dimensionality is pushed off into numerical precision..."

I think this point "You need to have enough dimensionality to represent the key features" has often been overlooked. I am speaking in particular about RAAMs, of which I have most experience. One of the great attractions of reduced representations is their potential to be used in holistic processes. However, it appears that the greater the 'reduction' the harder it is for holistic processing. Boden & Niklasson (1995) showed that for a set of tree structures encoded with a RAAM, the structure was maintained but the influence of constituents was not necessarily available for holistic processing.
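For readers unfamiliar with RAAMs, the heart of the scheme is an auto-associator that compresses two (or more) constituent vectors into a single parent vector of the same width, which can then serve recursively as a constituent itself. The following is a minimal single-pair numpy sketch of that compress/reconstruct step; a real RAAM is trained over a corpus of trees, reusing parent codes as children, and the widths, learning rate, and iteration count here are arbitrary illustrative choices.

    import numpy as np

    rng = np.random.default_rng(0)
    D = 10                                      # width of each terminal / parent vector
    W_enc = rng.normal(0, 0.1, (D, 2 * D))      # encoder: [left; right] -> parent
    W_dec = rng.normal(0, 0.1, (2 * D, D))      # decoder: parent -> [left; right]
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

    circle, triangle = rng.random(D), rng.random(D)   # toy terminal vectors
    pair = np.concatenate([circle, triangle])

    lr = 0.5
    for _ in range(5000):                       # train the auto-associator on one pair
        parent = sigmoid(W_enc @ pair)          # compressed "reduced description"
        recon = sigmoid(W_dec @ parent)
        err = recon - pair
        d_recon = err * recon * (1 - recon)     # backprop through the output sigmoid
        d_parent = (W_dec.T @ d_recon) * parent * (1 - parent)
        W_dec -= lr * np.outer(d_recon, parent)
        W_enc -= lr * np.outer(d_parent, pair)

    parent = sigmoid(W_enc @ pair)
    left, right = np.split(sigmoid(W_dec @ parent), 2)
    print(round(float(np.abs(left - circle).max()), 3),
          round(float(np.abs(right - triangle).max()), 3))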
About 3 years ago we developed (S)RAAM (simplified RAAM - see Callan & Palmer-Brown 1997) which uses PCA and a recursive procedure to produce matrices that simulate the first and second weight-layers of a RAAM. Unlike RAAMs, (S)RAAMs cannot reduce the 'representational width' beyond the redundancy present in the training set. One of my students (John Flackett) has repeated Boden and Niklasson's experiment with (S)RAAM and results (unsurprisingly) show a significant improvement over their RAAM. The action of the recursive process also appears to impose a weighting of the constituents but this is to be further explored. The weighting may prove useful for some tasks (e.g., planning) and so is not necessarily a bad thing for all forms of holistic processing.

It is also clear to me that (S)RAAMs have no capability to exhibit 'strong systematicity' and I believe the same is true of RAAMs. I am not ruling out the possibility of strong systematic behavior when RAAMs etc. are used in a modular system (some impressive results were demonstrated by Niklasson & Sharkey 1997).

For the general reading list, two recent papers that offer some interesting discussion are:

Steven Phillips - examines systematicity in feedforward and recurrent networks - ref below
James Hammerton - general discussion and definition of holistic computation - ref below

Callan R, Palmer-Brown D (1997). (S)RAAM: An Analytical Technique for Fast and Reliable Derivation of Connectionist Symbol Structure Representations. Connection Science, Vol 9, No 2.

Boden M, Niklasson L (1995). Features of Distributed Representations for Tree-structures: A Study of RAAM. Presented at the 2nd Swedish Conference on Connectionism. Published in Current Trends in Connectionism (Niklasson & Boden eds.) Lawrence Erlbaum Associates.

Niklasson L, Sharkey N E (1997). Systematicity and generalization in compositional connectionist representations, in G Dorffner (ed), Neural Networks and a New Artificial Intelligence. International Thomson Computer Press.

Phillips S (1998). Are Feedforward and Recurrent Networks Systematic? Analysis and Implications for a Connectionist Cognitive Architecture. Connection Science, Vol 10, No 2.

Hammerton J (1998). Holistic Computation: Reconstructing a Muddled Concept. Connection Science, Vol 10, No 1.

From scheler at informatik.tu-muenchen.de Tue Aug 18 07:37:49 1998 From: scheler at informatik.tu-muenchen.de (Gabriele Scheler) Date: Tue, 18 Aug 1998 13:37:49 +0200 Subject: Connectionist symbol processing: any progress? Message-ID: <98Aug18.133754+0200_met_dst.7649-22196+91@papa.informatik.tu-muenchen.de>

As the question of metrics in pattern recognition seems to be a matter of some controversy, here is my point of view:

It is quite possible to construct algorithms for learning the metric of some set of patterns in the supervised learning paradigm. This means that if we have a set of patterns for class A and a set of patterns for class B we can induce a metric such that class A patterns are close (similar) to each other and dissimilar to class B patterns. This metric will usually be rather distinct from metrics based on numeric properties, such as the Euclidean metric. It is especially useful in the case of binary patterns, which code some features with several distinct ("symbolic") values. Such a metric can be rather involved, for instance the metrics that determine phonemic similarity in different languages are quite different. (Think of "r" and "l" to speakers of Indo-European and some Asian languages).
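As a toy sketch of inducing such a metric from labeled binary patterns (this is only a crude illustration in the spirit of the idea, not Scheler's algorithm, and the data are synthetic), one can weight each feature by how much more often it disagrees between classes than within them and use the resulting weighted Hamming distance:

    import numpy as np

    rng = np.random.default_rng(2)

    # synthetic binary patterns: 8 features, only features 0-2 separate the classes
    def sample(cls_bit, n=40):
        X = rng.integers(0, 2, size=(n, 8))
        X[:, :3] = cls_bit
        X[:, :3] ^= (rng.random((n, 3)) < 0.1)   # a little feature noise
        return X

    A, B = sample(0), sample(1)

    def mean_diff(X, Y):
        # mean per-feature disagreement |x_i - y_i| over all pairs (x in X, y in Y)
        return np.mean(np.abs(X[:, None, :] - Y[None, :, :]), axis=(0, 1))

    within = (mean_diff(A, A) + mean_diff(B, B)) / 2
    between = mean_diff(A, B)
    w = np.clip(between - within, 0, None)       # induced feature weights

    def dist(x, y):                              # the induced weighted Hamming metric
        return float(np.sum(w * np.abs(x - y)))

    print(np.round(w, 2))
    print("within:", round(dist(A[0], A[1]), 2), "between:", round(dist(A[0], B[0]), 2))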
This approach has therefore been applied to the question of how phonemes (classes) relate to phonetic features (pattern sets). However, I believe as has also been pointed out by Pao and possibly Kohonen, the problems of learning distance metrics or of finding hypersurfaces (as in back-propagation) are in substance related. In the first case, we have a distorted topology (with respect to Euclidean space) but simple dividing lines (such as circles) , in the second case the toplogy stays fixed (euclidean space), but dividing surfaces may be rather complex. Although in certain cases, we may prefer one method rather than the other, the question of symbolic vs. non-symbolic ("numeric"?) representations really has not much to do with it. Recall that back-propagation is one method of a universal function approximation, and you find that any class distinction is approximable provided you have found a sufficient and suitable training set. (which is of course the practical problem.) Nonetheless efforts to build pattern classification schemes based on an induction of different metrics for different problems are I believe really interesting and may change some ideas on what constitute "easy" and "hard" problems. (For instance I agree with Lev, that classification according to parity, as well as several symmetry etc. problems become very easy with distance-measure-based approaches.) Gabriele References: Pao, Y.H.: Adaptive Pattern recognition and Neural Networks. Addison-Wesley, 1989. Kohonen, T.: Self-Organization and Associative Memory. Springer 1989. Scheler,G: Feature Selection with Exception Handling- An Example from Phonology, In; Trappl,R. (ed.) Proceedings of EMCSR 94, Springer, 1994. Scheler, G: Pattern Classification with Adaptive Distance Measures, Tech Rep FKI-188-94, 1994. (some more references at www7.informatik.tu-muenchen.de/~scheler/publications.html). From rsun at research.nj.nec.com Tue Aug 18 12:46:00 1998 From: rsun at research.nj.nec.com (Ron Sun) Date: Tue, 18 Aug 1998 12:46:00 -0400 Subject: CFP: Journal of Cognitive Systems Research Message-ID: <199808181646.MAA08213@pc-rsun.nj.nec.com> Subject: Call for Papers: new electronic Journal of Cognitive Systems Research CALL FOR PAPERS Journal of Cognitive Systems Research Editors-in-Chief Ron Sun E-mail: rsun at cs.ua.edu Department of Computer Science and Department of Psychology University of Alabama Tuscaloosa AL, USA Vasant Honavar E-mail: honavar at cs.iastate.edu Department of Computer Science Iowa State University USA Gregg Oden E-mail: gregg-oden at uiowa.edu Department of Psychology University of Iowa USA The Journal of Cognitive Systems Research covers all topics in the study of cognitive processes, in both natural and artificial systems: Knowledge Representation and Reasoning Learning Perception Action Memory Problem-Solving and Cognitive Skills Language and Communication Agents Integrative Studies of Cognitive Systems The journal emphasizes the integration/synthesis of ideas, concepts, constructs, theories, and techniques from multiple paradigms, perspectives, and disciplines, in the analysis, understanding and design of cognitive and intelligent systems. Contributions describing results obtained within the traditional disciplines are also sought if such work has broader implications and relevance. The journal seeks to foster and promote the discussion of novel approaches in studying cognitive and intelligent systems. 
It also encourages cross-fertilization of disciplines, by publishing high-quality contributions in all of the areas of study, including artificial intelligence, linguistics, psychology, psychiatry, philosophy, system and control theory, anthropology, sociology, biological sciences, and neuroscience. The scope of the journal includes the study of a variety of different cognitive systems, at different levels, ranging from social/cultural cognition, to individual cognitive agents, to components of such systems. Of particular interest are theoretical, experimental, and integrative studies and computational modeling of cognitive systems, at different levels of detail, and from different perspectives. Please send submissions in POSTSCRIPT format by electronic mail to one of the three co-Editors-in-Chief. Note The journal transends traditional disciplinary boundaries, and considers contributions from all relevant disciplines and approaches. The key is the quality of the work and the accessibility and relevance to readers in different disciplines. The first issue of this new on-line journal, published by ElsevierScience, will appear in early 1999. In addition to this electronic journal, the issues will also be printed and bound as archival volume. Published papers will be considered automatically for inclusion in specially edited books on Cognitive Systems Research. For further information, see: http://cs.ua.edu/~rsun/journal.html -------------------- action editors: John Barnden, School of Computer Science, University of Birmingham, U.K.\\ William Bechtel, Department of Philosophy, Washington University, St. Louis, USA.\\ Rik Belew, Computer Science and Engineering Department, University of California, San Diego, USA.\\ Mark H. Bickhard, Department of Psychology, Lehigh University, USA.\\ Deric Bownds, Dept. of Zoology, University of Wisconsin, Madison, USA. \\ David Chalmers, Department of Philosophy, University of California, Santa Cruz, USA. \\ B. Chandrasekaran, Department of Computer and Information Science, Ohio State University, USA.\\ Marco Dorigo, University of Brussels, Brussels, Belgium\\ Michael Dyer, Computer Science Department, University of California, Los Angeles, USA.\\ Lee Giles, NEC Research Institute, Princeton, New Jersey, USA. \\ George Graham, Philosophy Department, University of Alabama at Birmingham, Birmingham, AL, USA.\\ Stephen J.Hanson, Psychology Dept., Rutgers University, Newark, New Jersey, USA.\\ Valerie Gray Hardcastle, Dept. of Philosophy, Virginia Polytechnic and State University, Blacksburg, Virginia, USA.\\ James Hendler, Department of Computer Science, University of Maryland, College Park, USA.\\ Stephen M. Kosslyn, Department of Psychology, Harvard University, USA. \\ George Lakoff, Dept. of Linguistics, University of California, Berkeley, USA.\\ Joseph LeDoux, Center for Neuroscience, New York University, New York, USA.\\ Daniel Levine, Department of Psychology, University of Texas at Arlington, USA.\\ Vladimir J. Lumelsky, Robotics Laboratory, Department of Mechanical Engineering, University of Wisconsin, Madison, USA.\\ James Pustejovsky, Brandeis University, Massachusetts, USA.\\ Lynne M. Reder, Department of Psychology, Carnegie Mellon University, Pittsburgh, PA 15213, USA.\\ Jude Shavlik, Computer Sciences Department, University of Wisconsin, Madison, USA.\\ Tim Shallice, Department of Psychology, University College, London, UK \\ Aaron Sloman, School of Computer Science, The University of Birmingham, UK. 
\\ Paul Thagard, Philosophy Department, University of Waterloo, Canada.\\ Leonard Uhr, Computer Sciences Department, University of Wisconsin, Madison, USA.\\ David Waltz, NEC Research Institute, Princeton, NJ, USA.\\ Xin Yao, Dept. of Computer Science, Australian Defence Force Academy, Canberra, Australia.\\

From dsilver at csd.uwo.ca Tue Aug 18 16:57:14 1998 From: dsilver at csd.uwo.ca (Danny L. Silver) Date: Tue, 18 Aug 1998 16:57:14 -0400 (EDT) Subject: Connectionist symbol processing & work by Baxter Message-ID: <199808182057.QAA05573@church.ai.csd.uwo.ca>

Along the lines of Dr. Zhu, I would like to point out the important work of Jon Baxter concerning the "learning of internal representations", in which he develops a "canonical distortion measure" (CDM). The CDM is in fact a metric over the input space defined by the probability distribution over a domain of tasks (e.g. character recognition) each of which shares the input space. The metric can be used to measure the similarity of input vectors. The important aspect of Baxter's work is that he shows formally and demonstrates empirically that the CDM for a particular task (or environmental) domain can be LEARNED to the desired level of accuracy if the learner samples sufficiently from the domain of tasks, i.e. if the learner "experiences" the environment long enough and well enough. Once learned, this CDM metric can be used to facilitate learning any new task from the domain - thus it can be considered a process of "learning to learn". Baxter, in fact, demonstrates how the CDM metric can be learned within the hidden node representations of a neural network for a simple task domain.

Based on this, I would conclude that the facility of symbolic representation and logical metrics is largely a function of the domain of tasks under consideration and not necessarily THE best representation or metric.

For details please refer to:

Jonathan Baxter. Learning Internal Representations. PhD Thesis, Dept. Mathematics and Stats, The Flinders University of South Australia, 1995. Draft copy available in Neuroprose Archive - /pub/neuroprose/Thesis/baxter.thesis.ps.Z

Jonathan Baxter. The Canonical Distortion Measure for Vector Quantization and Function Approximation. Learning to Learn, edited by Sebastian Thrun and Lorien Pratt, 1998, Kluwer Academic Publishers, p.159-179

Cheers .. Danny Silver

-- ========================================================================= = Daniel L. Silver University of Western Ontario, London, Canada = = N6A 3K7 - Dept. of Comp. Sci. = = dsilver at csd.uwo.ca H: (902)582-7558 O: (902)494-1813 = = WWW home page .... http://www.csd.uwo.ca/~dsilver = =========================================================================

From giles at research.nj.nec.com Tue Aug 18 18:17:41 1998 From: giles at research.nj.nec.com (Lee Giles) Date: Tue, 18 Aug 1998 18:17:41 -0400 (EDT) Subject: What have neural networks achieved? In-Reply-To: from "Michael A. Arbib" at Aug 14, 98 02:07:20 pm Message-ID: <199808182217.SAA26188@alta.nj.nec.com>

Michael Arbib wrote:

> > b) What are the "big success stories" (i.e., of the kind the general public
> could understand) for neural networks contributing to the construction of
> "artificial" brains, i.e., successfully fielded applications of NN hardware
> and software that have had a major commercial or other impact?
>
> *********************************
> Michael A.
Arbib > USC Brain Project > University of Southern California > Los Angeles, CA 90089-2520, USA > arbib at pollux.usc.edu > (213) 740-9220; Fax: 213-740-5687 > http://www-hbp.usc.edu/HBP/ > > For those of you who might have missed it, an entire issue of IEEE TNN was devoted to "practical uses of NNs." The issue was IEEE Transactions on Neural Networks Volume 8, Number 4 July 1997 Table of Contents plus page numbers are below: 827... Neural Fraud Detection in Credit Card Operations Jose R. Dorronsoro, Francisco Ginel, Carmen Sanchez, and Carlos Santa Cruz 835... ANNSTLF--A Neural-Network-Based Electric Load Forecasting System Alireza Khotanzad, Member, IEEE, Reza Afkhami-Rohani, Tsun-Liang Lu, Alireza Abaye, Member, IEEE, Malcolm Davis, and Dominic J. Maratukulam 847... A Deployed Engineering Design Retrieval System Using Neural Networks Scott D. G. Smith, Richard Escobedo, Michael Anderson, and Thomas P. Caudell 852... Modeling Complex Environmental Data Chris M. Roadknight, Graham R. Balls, Gina E. Mills, and Dominic Palmer-Brown 863... Neural Networks and Traditional Time Series Methods: A Synergistic Combination in State Economic Forecasts James V. Hansen and Ray D. Nelson 874... Reliable Roll Force Prediction in Cold Mill Using Multiple Neural Networks Sungzoon Cho, Member, IEEE, Yongjung Cho, and Sungchul Yoon 883... Dynamic Neural Control for a Plasma Etch Process Jill P. Card, Member, IEEE, Debbie L. Sniderman, and Casimir Klimasauskas 902... Application of Neural Networks to Software Quality Modeling of a Very Large Telecommunications System Taghi M. Khoshgoftaar, Member, IEEE, Edward B. Allen, Member, IEEE, John P. Hudepohl, and Stephen J. Aud 910... Neural Intelligent Control for a Steel Plant Gerard Bloch, Member, IEEE, Franck Sirou, Vincent Eustache, and Philippe Fatrez 919... Characterization of Aluminum Hydroxide Particles from the Bayer Process Using Neural Network and Bayesian Classifiers Anthony Zaknich, Member, IEEE 932... Fuzzy Neural Networks for Machine Maintenance in Mass Transit Railway System James N. K. Liu, Member, IEEE, and K. Y. Sin 942... Dynamic Security Contingency Screening and Ranking Using Neural Networks Yakout Mansour, Fellow, IEEE, Ebrahim Vaahedi, Senior Member, IEEE, and Mohammed A. El-Sharkawi, Fellow, IEEE 951... Self-Calibration of a Space Robot Vicente Ruiz de Angulo and Carme Torras 964... Cork Quality Classification System using a Unified Image Processing and Fuzzy-Neural Network Methodology Joongho Chang, Gunhee Han, Jose M. Valverde, Norman C. Griswold, Senior Member, IEEE, J. Francisco Duque-Carrillo, Member, IEEE, and Edgar Sanchez-Sinencio, Fellow, IEEE Those of you who are IEEE members and have access can download these papers from http://opera.ieee.org/opera/browse.html Best regards Lee Giles -- __ C. Lee Giles / Computer Science / NEC Research Institute / 4 Independence Way / Princeton, NJ 08540, USA / 609-951-2642 / Fax 2482 www.neci.nj.nec.com/homepages/giles == From Jon.Baxter at keating.anu.edu.au Tue Aug 18 18:57:52 1998 From: Jon.Baxter at keating.anu.edu.au (Jonathan Baxter) Date: Wed, 19 Aug 1998 08:57:52 +1000 (EST) Subject: Connectionist symbol processing: any progress? 
Message-ID: <199808182257.IAA22931@reid.anu.edu.au> Forwarded message: > From owner-neuroz at munnari.oz.au Tue Aug 18 19:25:34 1998 > Date: Mon, 17 Aug 1998 18:09:01 -0300 (ADT) > From: Lev Goldfarb > X-Sender: goldfarb at sol.sun.csd.unb.ca > Reply-To: Lev Goldfarb > To: connectionists at cs.cmu.edu, inductive at unb.ca > Subject: Re: Connectionist symbol processing: any progress? > In-Reply-To: <199808160830.UAA08508 at rialto.mcs.vuw.ac.nz> > Message-Id: > Mime-Version: 1.0 > Content-Type: TEXT/PLAIN; charset=US-ASCII > > On Monday August 17, Lev Goldfarb wrote: > > However, in general, WHO WILL GIVE YOU THE "RIGHT" DISTANCE MEASURE? I now > believe that the construction of the "right" distance measure is a more > basic, INDUCTIVE LEARNING, PROBLEM. In a classical vector space setting, > this problem is obscured because of the rigidity of the representation > space (and, as I have mentioned earlier, because of the resulting > uniqueness of the metric), which apparently has not raised any > substantiated suspicions in non-cognitive sciences. I strongly believe > that this is due to the fact that the classical measurement processes are > based on the concept of number and therefore as long as we rely on such > measurement processes we are back where we started from--vector space > representation. Just to add a note on this point of "what is the right distance measure" and "where do you get it": it is reasonably clear that if you are faced with just a single learning problem finding the right distance measure is equivalent to the learning problem itself. After all, in a classification setting the perfect distance measure would set the distance between examples belonging to the same class to zero, and the distance between examples belonging to different clesses to some positive number. One-nearest-neighbour classification with such a distance metric and a training set containing at least one example of each class will have zero error. In contrast, in a "learning to learn" setting where a learner is faced with a (potentially infinite) *sequence* of learning tasks one can ask that the learner learns a distance metric that is in some sense appropriate for all the tasks. I think it is this sort of metric that people are thinking of when they talk about "the right distance measure". For example, in my life I have to learn to recognize thousands of faces, not just a single face. If I learn a distance measure that works for just a single face (say, just distinguish my father from everybody else) then that distance measure is unlikely to be the "right" measure for faces; it would most likely focus on some idiosyncratic feature of his face in order to make the distinction and would thus be unusable for distinuishing faces that don't possess such a feature. However, if I learn a distance measure that works for a large variety of faces, then that distance measure is more likely to focus on the "true" invariants of people's faces and hence has more chance of being the "right" measure. Anyway, to cut a long story short, you can formalize this idea of learning the right distance measure for a number of related tasks---I had papers on this in NIPS and ICML last year. You can also get them from my web page: http://wwwsyseng.anu.edu.au/~jon/papers/nips97.ps.gz http://wwwsyseng.anu.edu.au/~jon/papers/icml97.ps.gz. This idea has turned up in a number of different guises in various places (here is a few): Shimon Edelman. Representation, Similarity and the Chorus of Protoypes. Minds and Machines, 5, 1995. 
Oehler and Gray. Combining image compression and classification using vector quantization. IEEE Transactions on PAMI. 17(5): 461--473. 1995. Thrun and Mitchell. Learning one more thing. TR CS-94-184, CMU, 1994. Cheers, Jon ------------- Jonathan Baxter Department of Systems Engineering Research School of Information Science and Engineering Australian National University http://keating.anu.edu.au/~jon Tel: +61 2 6279 8678 Fax: +61 2 6279 8688 From curt at doumi.ucr.edu Wed Aug 19 01:49:49 1998 From: curt at doumi.ucr.edu (Curt Burgess) Date: Tue, 18 Aug 98 22:49:49 -0700 Subject: Connectionist symbol processing: any progress? LSA & HAL models Message-ID: <9808190549.AA07717@doumi.ucr.edu> > One of things that has recently renewed my interest in the > idea of using distributed representations for processing > complex information was finding out about Latent Semantic > Analysis/Indexing (LSA/LSI) at NIPS*97. LSA is a method > for taking a large corpus of text and constructing vector I think LSA is an important approach in this symbol processing debate. A model similar in many ways to LSA is our Hyperspace Analogue to Language (HAL) model of memory (also at NIPS*97 [workshops]). One difference is that LSA (typically) is implemented in a matrix of word by larger text unit dimensions. HAL is a word by word matrix. There are other differences - one being how dimensionality is reduced. One big advantage of HAL and LSA is that they use learning procedures that scale up to real world language problems and thus can use large corpora as input. It would be difficult to put 300 million words through an SRN the number of times required for any learning to take place (!). With global co-occurrence models like HAL or LSA, this scalability isn't a problem. HAL and LSA also use continuous valued vector representations which result in very rich encoding of meaning. We've addressed the scalability issue by comparing HAL's algorithm to an SRN in a chapter available on my lab's web page (http://locutus.ucr.edu/Reprints.html) - get the Burgess & Lund (1998, under review) document. We show that the same input into an SRN and HAL will get virtually identical output. The beauty of this is that one can use vector representations acquired in a global co-occurrence model in a connectionist model knowing that these vectors are what would likely be produced if they were learned via a connectionist methodology. In this chapter we also address a variety of other related issues (what is similarity? the symbol-grounding problem, the relationship between associations and categorical knowledge, modularity and syntactic constraints, developing asymmetric relationships between words, and, in a limited way, using high-dimensional models to mimic higher-level cognition). The chapter was written to be a little provocative. There are 6 or 7 papers detailing the HAL model available as PDFs and another 6 or 7 that you can order with the reprint order form. The latest issue of Discourse Processes (edited by Peter Foltz) is a special issue on quantitative approaches to language and is full of LSA and HAL papers. I will be editing a special journal issue that will have more HAL and LSA papers (a followup to the high-dimensional semantic space symposium at psychonomics last year).
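[To make the global co-occurrence idea concrete, here is a minimal sketch of a HAL-style word-by-word matrix built with a weighted sliding window. The window size, the linear distance weighting, and the toy corpus are assumptions for illustration only; they are not taken from the Burgess & Lund papers cited above.]

from collections import defaultdict

def hal_matrix(tokens, window=10):
    """Build a HAL-style word-by-word co-occurrence matrix.

    For each word, earlier words inside a sliding window are counted,
    weighted so that nearer neighbours contribute more (window - distance + 1).
    Returns a nested dict: counts[target][context] -> weight.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        start = max(0, i - window)
        for j in range(start, i):
            distance = i - j
            counts[word][tokens[j]] += window - distance + 1
    return counts

# Tiny demonstration corpus (hypothetical).
corpus = "the dog chased the cat and the cat chased the mouse".split()
m = hal_matrix(corpus, window=5)
print(dict(m["cat"]))   # weighted counts of words preceding "cat"

[Each word's row of weighted counts then serves as a high-dimensional vector for that word; the point of the sketch is only that such a matrix can be accumulated in a single pass over an arbitrarily large corpus.]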
At the SCiP (Society for Computers in Psychology) conference in Nov (the day before Psychonomics), we will have a symposium on high-dimensional semantic memory models and Tom Landauer is giving the keynote talk (titled "How modern computation can turn cognitive psychology into a real science" - I suspect also a little provocative!). I gave the keynote at last year's SCiP meeting and this is available on our website and is in the last issue of BRMIC (Burgess, C. (1998). From Simple Associations to the Building Blocks of Language: Modeling Meaning in Memory with the HAL Model. Behavior Research Methods, Instruments, and Computers, 30, 188 - 198.). It's a good brief introduction to the range of problems we've addressed. The HAL and LSA work is certainly related to the "context vector" research that Steve Gallant was talking about. I guess that's enough... Curt --- Dr. Curt Burgess, Computational Cognition Lab Department of Psychology, University of California Riverside, CA 92521-0426 URL: http://locutus.ucr.edu/ Internet: curt at cassandra.ucr.edu MaBellNet: (909) 787-2392 FAX: (909) 787-3985 From Sebastian_Thrun at heaven.learning.cs.cmu.edu Wed Aug 19 19:17:01 1998 From: Sebastian_Thrun at heaven.learning.cs.cmu.edu (Sebastian Thrun) Date: Wed, 19 Aug 1998 19:17:01 -0400 Subject: Organize a workshop at IJCAI-99? Message-ID: Dear Connectionists: This is to bring to your attention a great opportunity to organize a workshop at the forthcoming IJCAI conference (IJCAI stands for International Joint Conference on Artificial Intelligence), which will take place 31 July - 6 August 1999 in Stockholm, Sweden. IJCAI is a leading AI conference, and in recent years there has been a good deal of overlap with meetings such as NIPS and Snowbird (e.g., work on learning, Bayesian methods). Organizing a workshop at IJCAI is a great way to get people outside the field involved in the type of work carried out in the "connectionist" community. For IJCAI-99, we will particularly welcome workshop proposals with cross-cutting themes. If you are interested in submitting a proposal, please consult the Web page http://www.dsv.su.se/ijcai-99/ Deadline for proposals is Oct 1, 1998. Proposals will be selected on a competitive basis. Workshop topics of past IJCAI conferences can be found at http://www.ijcai.org/past/. Sebastian Thrun (workshop chair, IJCAI-99) From marks at maxwell.ee.washington.edu Wed Aug 19 15:58:43 1998 From: marks at maxwell.ee.washington.edu (Robert J. Marks II) Date: Wed, 19 Aug 1998 12:58:43 -0700 Subject: What have neural networks achieved? In-Reply-To: <199808182217.SAA26188@alta.nj.nec.com> References: Message-ID: <3.0.1.32.19980819125843.00706c44@maxwell.ee.washington.edu> At 06:17 PM 8/18/98 -0400, Lee Giles wrote: >Michael Arbib wrote: > >> >> b) What are the "big success stories" (i.e., of the kind the general public >> could understand) for neural networks contributing to the construction of >> "artificial" brains, i.e., successfully fielded applications of NN hardware >> and software that have had a major commercial or other impact? >> >> >> >> ********************************* >> Michael A. Arbib >> USC Brain Project >> University of Southern California >> Los Angeles, CA 90089-2520, USA >> arbib at pollux.usc.edu >> (213) 740-9220; Fax: 213-740-5687 >> http://www-hbp.usc.edu/HBP/ >> >> > >For those of you who might have missed it, an entire issue of IEEE TNN >was devoted to "practical uses of NNs."
The issue was > >IEEE Transactions on Neural Networks >Volume 8, Number 4 July 1997 > Slides from a talk entitled "Neural Networks: Reduction to Practice" are on the web at http://cialab.ee.washington.edu/Marks-Stuff/icnn_97.html-ssi Nutshell summaries of the TNN papers in the special issue are given, plus numerous other everyday uses of neural networks. From timmers at nici.kun.nl Thu Aug 20 04:12:26 1998 From: timmers at nici.kun.nl (renee timmers) Date: Thu, 20 Aug 1998 10:12:26 +0200 Subject: job announcement Message-ID: Postdoctoral Researchers in Music Cognition At the Nijmegen Institute of Cognition and Information (NICI) of the University of Nijmegen a research team was set up in September 1997, supported by the Dutch Foundation for Scientific Research (NWO) as the PIONIER project "Music, Mind, Machine". This project aims at improving the understanding of the temporal aspects of musical knowledge and music cognition using computational models. The research is interdisciplinary in nature, with contributions from music theory, psychology and computer science. A number of studies are planned, grouped according to the following perspectives: the computational modeling methodology, the music domain itself, and applications of the findings. The methodological studies are concerned with the development of cognitive modeling languages, the study of (sub)symbolic formalisms, the development of programming language constructs for music, and the evaluation of physical metaphors in modeling expressive timing. The domain studies focus on specific temporal aspects of music, such as beat induction, grace note timing, musical expression and continuous modulations in music performance. In these areas both the construction of computational models and their experimental validation are being undertaken. The theoretical results will be applied in, e.g., editors for musical expression for use in recording studios. In order to realize these aims, a multi-disciplinary research group was formed, in which teamwork and collaboration play a crucial role. It is expected that all team members are actively involved in building the team and the realization of the project's aims. The demands on the team members are high: conducting innovative and internationally recognized research. However, in return, our stimulating research environment provides adequate training and technical support, including a high-quality infrastructure and recording and music processing facilities. Close contact is maintained with the international community of researchers in this field. More information on the project and a description of the planned studies can be found at http://www.nici.kun.nl/mmm Ref 21.2.98 One postdoc will be responsible for improving an existing connectionist model for quantization and will design and validate this and other models and supervise their implementation. Quantization is the process of separating the categorical, discrete timing components (durations as notated in the musical score) from the continuous deviations in a musical performance. The project has, next to the fundamental aspects (connectionist models of categorical rhythm perception and their empirical validation), an important practical focus and aims at developing a robust component for automatic music transcription systems.
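[As a point of reference for what the quantization component has to do, here is a minimal sketch of naive grid quantization: performed onset times are snapped to the nearest grid subdivision, and the residuals are the continuous expressive deviations. The grid size and the example onsets are assumptions for illustration; the project's connectionist models are aimed precisely at the cases where such a fixed grid fails.]

def quantize(onsets, grid=0.25):
    """Snap performed onset times (in beats) to the nearest grid point,
    separating the categorical score durations from the expressive deviations.

    Returns (quantized_onsets, deviations). A naive grid quantizer only;
    not the connectionist model described in the announcement.
    """
    quantized = [round(t / grid) * grid for t in onsets]
    deviations = [t - q for t, q in zip(onsets, quantized)]
    return quantized, deviations

# A performed rhythm with small timing deviations (hypothetical values).
performed = [0.02, 0.51, 0.98, 1.53, 2.01]
score, expressive = quantize(performed)
print(score)       # [0.0, 0.5, 1.0, 1.5, 2.0]
print(expressive)  # the continuous timing deviations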
The research will be realized at the lab for Medical Physics and Biophysics (MBFYS) and at the Nijmegen Institute for Cognition and Information (NICI), both at the University of Nijmegen, and is funded by the Dutch Foundation for Technical Sciences (STW). We are looking for a psychologist with experience in both experimental methods and in computational modeling. Experience with attractor networks is an advantage. Appointment will be full-time for three years, with a possible extension. Ref 21.3.98 The other position requires a Doctorate in Music Theory/Analysis, Psychology, or Music Cognition. A thorough knowledge of the music cognition literature is required, preferably centering on a computational modeling approach. In addition, the candidate needs to have ample practical experience in conducting experiments and a thorough knowledge of music theory. Although the project focuses on musical performance and rhythmic structure, research experience in these domains is not essential. He or she must be able and willing to collaborate with the other members of the team on existing research projects and contribute to the supervision of doctoral level research. The ability to communicate clearly and work as part of a team is crucial. Experience in collaboration with researchers from computer science, artificial intelligence, or music technology would be beneficial, as would some knowledge of these fields. Appointment will be full-time for two years, with a possible extension. The Faculty of Social Sciences intends to employ a proportionate number of women and men in all positions in the faculty. Women are therefore urgently invited to apply. The selection procedure may entail an assessment of collaboration and communication skills. Applications (three copies, in English or Dutch) including a curriculum vitae and a statement about the candidate's professional interests and goals, and one copy of recent work (e.g., thesis, computer program, article) should be mailed before the 1st of November to the Department of Personnel & Organization, Faculty of Social Sciences, Catholic University Nijmegen, P.O.Box 9104, 6500 HE Nijmegen, The Netherlands. Please mark envelope and letter with the appropriate vacancy number. Questions can be addressed to Renee Timmers: timmers at nici.kun.nl From mashouq at ix.netcom.com Thu Aug 20 00:28:31 1998 From: mashouq at ix.netcom.com (Dr. Khalid Al-Mashouq) Date: Wed, 19 Aug 1998 23:28:31 -0500 Subject: What have neural networks achieved? Message-ID: <35DBA5EF.FCB1B000@ix.netcom.com> I am part of ACES (http://www.riyadhweb.com/aces), a telecommunications and electronics company based in Saudi Arabia. Recently we sold Lucent Technologies (formerly AT&T) a multi-million system to test the Saudi GSM mobile network quality on the voice level (not bit-error rate level). This system is made by ASCOM (http://www.ascom.ch/qvoice) and is accepted worldwide. It uses neural-network and fuzzy techniques to assess the quality of the received voice signal without the overhead of sending real people into the field to measure the quality. As ASCOM puts it: Using advanced neural networking technology, the fully automatic system is trained to map speech patterns, and to produce quality ratings which correlate over 98% to those produced "unconsciously" by the combination of the human ear and brain. (For technical details about this system and other related systems, see http://www.ascom.ch/qvoice/qos/qqos0000.htm and http://www.ascom.ch/qvoice/car/qcar0000.htm) Hope this information is useful.
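[The ASCOM pages above do not describe the system's internals, so the following is only a generic sketch of the underlying idea: a small network is trained to map acoustic features of degraded speech to listener-style quality ratings. The features, data, and network here are entirely hypothetical and are not ASCOM's QVoice system.]

import numpy as np

# Hypothetical "acoustic features per call" and listener ratings on a 1-5 scale.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))          # e.g. spectral distortion measures (made up)
w_true = rng.normal(size=8)
y = np.clip(3.0 + X @ w_true * 0.3 + rng.normal(scale=0.2, size=200), 1, 5)

# One-hidden-layer network trained by plain gradient descent on squared error.
W1 = rng.normal(scale=0.1, size=(8, 16)); b1 = np.zeros(16)
W2 = rng.normal(scale=0.1, size=(16, 1)); b2 = np.zeros(1)
for _ in range(2000):
    h = np.tanh(X @ W1 + b1)
    pred = (h @ W2 + b2).ravel()
    err = pred - y
    gW2 = h.T @ err[:, None] / len(y); gb2 = err.mean(keepdims=True)
    gh = err[:, None] @ W2.T * (1 - h ** 2)
    gW1 = X.T @ gh / len(y); gb1 = gh.mean(axis=0)
    for p, g in ((W2, gW2), (b2, gb2), (W1, gW1), (b1, gb1)):
        p -= 0.1 * g
print(np.corrcoef(pred, y)[0, 1])      # correlation with the "listener" ratings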
Khalid Al-Mashouq Visiting professor at CMU, Pittsburgh. King Saud University Riyadh, Saudi Arabia http://www.angelfire.com/ok/almashouq From Yves.Moreau at esat.kuleuven.ac.be Thu Aug 20 11:59:56 1998 From: Yves.Moreau at esat.kuleuven.ac.be (Yves Moreau) Date: Thu, 20 Aug 1998 17:59:56 +0200 Subject: What have neural networks achieved? References: <3.0.1.32.19980819125843.00706c44@maxwell.ee.washington.edu> Message-ID: <35DC47FC.C2476BBA@esat.kuleuven.ac.be> Dear Connectionists, I would like to point to the homepage of the European project SIENA, which has reviewed applications of neural networks in Europe and presents a number of case studies: http://www.mbfys.kun.nl/snn/siena/cases/ http://www.augusta.co.uk/siena/ Yves Moreau Department of Electrical Engineering Katholieke Universiteit Leuven Kardinaal Mercierlaan 94 B-3001 Leuven Belgium moreau at esat.kuleuven.ac.be

Case studies of successful applications
-------------------------

Benelux
Prediction of Yarn Properties in Chemical Process Technology
Current Prediction for Shipping Guidance in IJmuiden
Recognition of Exploitable Oil and Gas Wells
Modelling Market Dynamics in Food-, Durables- and Financial Markets
Prediction of Newspaper Sales
Production Planning for Client Specific Transformers
Qualification of Shock-Tuning for Automobiles
Diagnosis of Spot Welds
Automatic Handwriting Recognition
Automatic Sorting of Pot Plants

Spain/Portugal
Fraud detection in credit card transactions
Drinking Water Supply Management
On-line Quality Modelling in Polymer Production
Neural OCR Processing of Employment Demands
Neural OCR Personnel Information Processing at Madrid's Delegacion Provincial de Educacion
Neural OCR Processing of Sales Orders
Neural OCR Processing of Social Security Forms

Germany/Austria
Predicting Sales of Articles in Supermarkets
Automatic Quality Control System for Tile-making Works
HERACLAS - Quality Assurance by "listening"
Optimizing Facilities for Polymerization
Quality Assurance and Increased Efficiency in Medical Projects
Classification of Defects in Pipelines
A New Method for Computer Assisted Prediction of Lymphnode-Metastasis in Gastric Cancer
Alarm Identification with SENECA
Facilities for Material-Specific Sorting and Selection
Optimized Dryer-Regulation
Application of Neural Networks for Evaluating the Reaction State of Penicillin-Fermenters
Substitution of Analysers in Distillation Columns
Optical Positioning in Industrial Production
Short-Term Load Forecast for German Power Utility
Monitoring of Water Dam
ZN-Face: Access Control Using Automated Face Recognition
Control of Tempering Furnaces

France/Italy
Helicopter Flight Data Analysis (HFDA)
Neural Forecaster for On-line Load Profile Correction

UK/Scandinavia
For more than 30 UK case studies we refer to the applications portfolio at DTI's NeuroComputing Web

---------------------------------------------------------------- > >Michael Arbib wrote: > > > >> > >> b) What are the "big success stories" (i.e., of the kind the general public > >> could understand) for neural networks contributing to the construction of > >> "artificial" brains, i.e., successfully fielded applications of NN hardware > >> and software that have had a major commercial or other impact? > >> > >> > >> > >> ********************************* > >> Michael A.
Arbib > >> USC Brain Project > >> University of Southern California > >> Los Angeles, CA 90089-2520, USA > >> arbib at pollux.usc.edu > >> (213) 740-9220; Fax: 213-740-5687 > >> http://www-hbp.usc.edu/HBP/ > >> > >> > > From stefan.wermter at sunderland.ac.uk Thu Aug 20 15:10:21 1998 From: stefan.wermter at sunderland.ac.uk (Stefan Wermter) Date: Thu, 20 Aug 1998 20:10:21 +0100 Subject: Connectionist symbol processing: any progress Message-ID: <35DC749D.5D7E2E33@sunderland.ac.uk> Jamie Henderson writes: > - Connectionist approaches to processing structural information have made > significant progress, to the point that they can now be justified on > purely empirical/engineering grounds. > - Connectionist methods do solve problems that current non-connectionist > methods have (ad-hoc independence assumptions, sparse data, etc.), > and people working in learning know it. > - Connectionist NLP researchers should be using modern empirical methods, > and they will be taken seriously if they do. I would support Jamie Henderson's view. While it might have been the state of the art 10 years ago to focus on small single networks for toy tasks in isolation, there has been an interesting development of using connectionist networks not only for cognitive modeling but for instance for (language) engineering. Larger modular architectures have been explored (for instance there were several recent issues on modular architectures in the journal Connection Science, guest-edited by A. Sharkey) and neural networks might also be used in context with other modules in larger systems. And it is useful and necessary to compare with traditional well-known techniques, e.g. n-grams, etc. In some of our recent work on the SCREEN system, for instance, we have processed speech input from acoustics over syntax and semantics up to dialog levels based on two corpora of several thousand words. All the main processing could be done with neural networks in a modular architecture for a speech/language system. So connectionist techniques are not only useful for modeling specific cognitive constraints well but can also be used successfully for larger tasks like learning text tagging or learning spoken language analysis. Some references are below, if you are interested. Wermter S., Weber, V. 1997. SCREEN: Learning a flat syntactic and semantic spoken language analysis using artificial neural networks, Journal of Artificial Intelligence Research 6(1) p. 35-85 Wermter, S. Riloff, E. Scheler, G. (Ed). 1996. Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing Springer Verlag, Berlin. Wermter S., Meurer M. 1997. Building lexical representations dynamically using artificial neural networks. Proceedings of the International Conference of the Cognitive Science Society, p. 802-807, Stanford. I would be interested to hear if anybody working on neural network techniques has recently developed MODULAR neural techniques in other fields, e.g. for integrating vision and speech, data/text mining, intelligent controllers, learning web agents, neuro-fuzzy reasoning, information extraction and information retrieval or other forms of intelligent processing. best wishes, Stefan ******************************************** Professor Stefan Wermter Research Chair in Intelligent Systems University of Sunderland Dept.
of Computing & Information Systems St Peters Way Sunderland SR6 0DD United Kingdom phone: +44 191 515 3279 fax: +44 191 515 2781 email: stefan.wermter at sunderland.ac.uk http://osiris.sunderland.ac.uk/~cs0stw/ ******************************************** From ted.carnevale at yale.edu Thu Aug 20 18:02:35 1998 From: ted.carnevale at yale.edu (Ted Carnevale) Date: Thu, 20 Aug 1998 18:02:35 -0400 Subject: NEURON course at SFN98 Message-ID: <35DC9CFB.6BD6@yale.edu> Short course announcement Title: Using the NEURON Simulation Environment What, where, and when: This is a Satellite Symposium that will be presented at the Society for Neuroscience meeting in Los Angeles, CA, on Saturday, Nov. 7, 1998. Speakers: N.T. Carnevale, M.L. Hines, J.W. Moore, and G.M. Shepherd Description NEURON, which runs under UNIX/Linux, MSWindows, and MacOS, is an advanced simulation environment that handles realistic models of biophysical mechanisms, individual neurons, and networks of cells. In lectures with live computer demonstrations, this course will present information essential for teaching and research applications of NEURON. It will emphasize practical issues that are key to the most productive use of this powerful and convenient modeling tool. Partial list of topics to be covered:
* How NEURON separates biologically important features, such as spatio-temporal complexity, from purely numerical concerns like accuracy and stability.
* Efficient design, implementation, and use of models, including variable-order variable-timestep integration, and NEURON's latest tools for network simulations.
* Using the graphical interface to control and modify simulations, and to analyze data and simulation results without additional programming.
* Getting the most out of special features such as vectors and the extensive function library.
* Expanding NEURON's repertoire of biophysical mechanisms.
* Using NEURON simulations in neuroscience education.
* Databases for empirically-based modeling.
Each registrant will receive a CD-ROM with software, plus a comprehensive set of notes that includes material which has not yet appeared elsewhere in print. Registration is limited to 55 individuals on a first-come, first-served basis. Early registration deadline Friday, October 2, 1998 Late registration deadline Friday, October 16, 1998 NO on-site registration will be accepted. For more information and the electronic registration form, see http://www.neuron.yale.edu/sfn98.html --Ted From l.s.smith at cs.stir.ac.uk Fri Aug 21 12:22:37 1998 From: l.s.smith at cs.stir.ac.uk (Dr L S Smith (Staff)) Date: Fri, 21 Aug 98 17:22:37 +0100 Subject: Book from EWNS1 Neuromorphic Systems Workshop Message-ID: <199808211622.RAA26776@tinker.cs.stir.ac.uk> Title: Neuromorphic Systems: Engineering Silicon from Neurobiology Editors: L.S. Smith and A. Hamilton Publisher: World Scientific. 260 pages. Series: Progress in Neural Processing 10. ISBN: 981 02 3377 9 This book is the refereed proceedings of the 1st European Workshop on Neuromorphic Systems, held in August 1997, at Stirling, Scotland. Further details of the book may be found at the www page of the conference, http://www.cs.stir.ac.uk/~lss/Neuromorphic/Info1 or at the publisher's site http://www.wspc.com.sg/books/compsci/3702.html *note* that the 2nd European Workshop on Neuromorphic Systems will be held at Stirling, Scotland from 3-5 September 1999. No www page yet. Dr Leslie S.
Smith Dept of Computing Science and Mathematics, Univ of Stirling Stirling FK9 4LA, Scotland l.s.smith at cs.stir.ac.uk (NeXTmail and MIME welcome) Tel (44) 1786 467435 Fax (44) 1786 464551 www http://www.cs.stir.ac.uk/~lss/ From juergen at idsia.ch Fri Aug 21 12:28:52 1998 From: juergen at idsia.ch (Juergen Schmidhuber) Date: Fri, 21 Aug 1998 18:28:52 +0200 Subject: PhD student jobs Message-ID: <199808211628.SAA04263@ruebe.idsia.ch> ******** ETH Zurich and IDSIA in Lugano (Switzerland) ******* PhD student positions We are seeking two outstanding PhD candidates for an exciting research project that combines machine learning (reinforcement learning, evolutionary computation, neural nets) and computational fluid dynamics. We intend to tackle problems such as drag minimisation, noise control, etc., using innovative control devices such as synthetic actuators, active skins, etc. This is a joint project of the Institute of Fluid Dynamics at ETH Zurich and the machine learning research institute IDSIA in Lugano (IDSIA ranked among the world's top ten AI labs in the 1997 "X-Lab Survey" by Business Week Magazine). Both are located in Switzerland, origin of the WWW and country with highest citation impact factor as well as most Nobel prizes and supercomputing capacity per capita. We will maintain very active links to Fluid Dynamics and AI institutes at Stanford University and NASA Ames Research Center. We offer an attractive Swiss PhD student salary. Highly qualified candidates are sought with a background in computational sciences, engineering, mathematics, physics or other relevant areas. Applicants should submit: (i) a detailed curriculum vitae, (ii) a list of three references (and their email addresses), (iii) transcripts of undergraduate and graduate (if applicable) studies, and (iv) a concise statement of their research interests (two pages max). Candidates are also encouraged to submit their scores in the Graduate Record Examination (GRE) general test (if available). Please send all documents to: Petros Koumoutsakos, www.ifd.mavt.ethz.ch Institute for Fluid Dynamics ETH Zentrum, CH-8092, Zurich, Switzerland OR Juergen Schmidhuber www.idsia.ch IDSIA, Corso Elvezia 36, 6900-Lugano, Switzerland Applications (with WWW pointers to studies or papers, if available) can also be submitted electronically (in plain ASCII or postscript format) to petros at ifd.mavt.ethz.ch or juergen at idsia.ch Petros & Juergen From marshall at cs.unc.edu Fri Aug 21 12:25:25 1998 From: marshall at cs.unc.edu (Jonathan A. Marshall) Date: Fri, 21 Aug 1998 12:25:25 -0400 (EDT) Subject: New vision & pattern recognition papers available Message-ID: I would like to announce the availability of several new papers on vision, pattern recognition, and neural systems. These papers may be obtained from http://www.cs.unc.edu/~marshall --Jonathan A. Marshall marshall at computer.org Dept of Computer Science, Univ of North Carolina, Chapel Hill, NC, USA. Visionics Corp., Jersey City, NJ, USA. ---------------------------------------------------------------------------- Gupta VS, Alley RK, Marshall JA, "Development of triadic neural circuits for visual image stabilization under eye movements." Submitted for journal publication, July 1998. Human visual systems maintain a stable internal representation of a scene even though the image on the retina is constantly changing because of eye movements. Such stabilization can theoretically be effected by dynamic shifts in the receptive field (RF) of neurons in the visual system.
This paper examines how a neural circuit can learn to generate such shifts. The shifts are controlled by eye position signals and compensate for the movement in the retinal image caused by eye movements. The development of a neural shifter circuit (Olshausen, Anderson, & Van Essen, 1992) is modeled using triadic connections. These connections are gated by signals that indicate the direction of gaze (eye position signals). In simulations, a neural model is exposed to sequences of stimuli paired with appropriate eye position signals. The initially nonspecific gating weights change, using a triadic learning rule. The pattern of gating develops so that different eye position signals selectively gate pathways from different positions within the visual field. Neurons then exhibit dynamic RF shifts, responding to the preferred stimulus within the RF and continuing to respond when the stimulus moves because of a shift in eye position. The triadic learning rule thus produces a shifter circuit that exhibits visual image stabilization. Traditional dyadic networks and learning rules do not produce such behavior. The self-organization capability of the model reduces the need for detailed pre-wiring or specific genetic programming of development. This shifter circuit model may also help in analyzing the behavior and formation of anticipatory RF shifts, which can reduce latency of visual response after eye movements, and attention-modulated changes in visual processing. ---------------------------------------------------------------------------- Marshall JA, Gupta VS, "Generalization and exclusive allocation of credit in unsupervised category learning." Network: Computation in Neural Systems 9:279-302, May 1998. A new way of measuring generalization in unsupervised learning is presented. The measure is based on an exclusive allocation, or credit assignment, criterion. In a classifier that satisfies the criterion, input patterns are parsed so that the credit for each input feature is assigned exclusively to one of multiple, possibly overlapping, output categories. Such a classifier achieves context-sensitive, global representations of pattern data. Two additional constraints, sequence masking and uncertainty multiplexing, are described; these can be used to refine the measure of generalization. The generalization performance of EXIN networks, winner-take-all competitive learning networks, linear decorrelator networks, and Nigrin's SONNET-2 network is compared. ---------------------------------------------------------------------------- Marshall JA, Schmitt CP, Kalarickal GJ, Alley RK, "Neural model of transfer-of-binding in visual relative motion perception." To appear in Computational Neuroscience: Trends in Research, 1998. January 1998. How can a visual system or cognitive system use the changing relationships between moving visual elements to decide which elements belong together as groups (or objects)? We have constructed a neural circuit model that selects object groupings based on global Gestalt common-fate evidence and uses information about the behavior of each group to predict the behavior of elements of the group. A simple competitive neural circuit binds elements into a representation of an object. Information about the spiking pattern of neurons allows transfer of the bindings of an object representation from location to location in the neural circuit as the object moves. 
The model exhibits characteristics of human object grouping and solves some key neural circuit design problems in visual relative motion perception. ---------------------------------------------------------------------------- Marshall JA, Srikanth V, "Curved trajectory prediction using a self-organizing neural network." Submitted for journal publication, September 1997. Existing neural network models are capable of tracking linear trajectories of moving visual objects. This paper describes an additional neural mechanism, disfacilitation, that enhances the ability of a visual system to track curved trajectories. The added mechanism combines information about an object's trajectory with information about changes in the object's trajectory, to improve the estimates for the object's next probable location. Computational simulations are presented that show how the neural mechanism can learn to track the speed of objects and how the network operates to predict the trajectories of accelerating and decelerating objects. ---------------------------------------------------------------------------- These five papers form part of Dr. George Kalarickal's recent dissertation: Kalarickal GJ, Theory of Cortical Plasticity in Vision. PhD Dissertation, Department of Computer Science, University of North Carolina at Chapel Hill, 1998. Kalarickal GJ, Marshall JA, "Comparison of generalized Hebbian rules for long-term synaptic plasticity." Submitted for journal publication, July 1998. Kalarickal GJ, Marshall JA, "The role of afferent excitatory and lateral inhibitory synaptic plasticity in visual cortical ocular dominance plasticity." Submitted for journal publication, July 1998. Kalarickal GJ, Marshall JA, "Plasticity in cortical neuron properties: Modeling the effects of an NMDA antagonist and a GABA agonist during visual deprivation." Submitted for journal publication, July 1998. Kalarickal GJ, Marshall JA, "Models of receptive field dynamics in visual cortex." Submitted for journal publication, May 1998. Kalarickal GJ, Marshall JA, "Rearrangement of receptive field topography after intracortical and peripheral stimulation: The role of plasticity in inhibitory pathways." Submitted for journal publication, July 1998. A theory of postnatal activity-dependent neural plasticity based on synaptic weight modification is presented. Synaptic weight modifications are governed by simple variants of a Hebbian rule for excitatory pathways and an anti-Hebbian rule for inhibitory pathways. The dissertation focuses on modeling the following cortical phenomena: long-term potentiation and depression (LTP and LTD); dynamic receptive field changes during artificial scotoma conditioning in adult animals; adult cortical plasticity induced by bilateral retinal lesions, intracortical microstimulation (ICMS), and repetitive peripheral stimulation; changes in ocular dominance during "classical" rearing conditioning; and the effect of neuropharmacological manipulations on plasticity. Novel experiments are proposed to test the predictions of the proposed models, and the models are compared with other models of cortical properties. The models presented in the dissertation provide insights into the neural basis of perceptual learning. In perceptual learning, persistent changes in cortical neuronal receptive fields are produced by conditioning procedures that manipulate the activation of cortical neurons by repeated stimulation of localized regions. 
Thus, the analysis of synaptic plasticity rules for receptive field changes produced by conditioning procedures that activate small groups of neurons can also elucidate the neural basis of perceptual learning. Previous experimental and theoretical work on cortical plasticity focused mainly on afferent excitatory synaptic plasticity. The novel and unifying theme in this work is self-organization and the use of the lateral inhibitory synaptic plasticity rule. Many cortical properties, e.g., orientation selectivity, motion selectivity, spatial frequency selectivity, etc. are produced or strongly influenced by inhibitory interactions. Thus, changes in these properties could be produced by lateral inhibitory synaptic plasticity. ---------------------------------------------------------------------------- From arobert at cogsci.ucsd.edu Fri Aug 21 15:33:46 1998 From: arobert at cogsci.ucsd.edu (Adrian Robert) Date: Fri, 21 Aug 98 12:33:46 -0700 Subject: What have neural networks achieved? References: Message-ID: <199808211933.MAA12922@briah.ucsd.edu> After Dr. Arbib's request everyone seems to have been coming up with commercial applications (using NNs as statistical analyzers) but as far as his first question -- about generation of insight into real brain function -- zero. Lest this be taken as a sinister sign in yet another area of neural network research, I hurry to mention the one major example that I'm familiar with -- that of understanding the influence of environmental input on cortical neural representations. The work I'm talking about is of course that done, starting with von der Malsburg and others in the 70's, given a fresh impulse by Linsker in the 80's, and most thoroughly connected to the biology by Miller in the 90's, on the development of orientation selectivity (and also maps of orientation selectivity and ocular dominance columns) in primary visual cortex. While anyone in the field will tell you that the final word has yet to be said, this work genuinely provides insight -- it shows how the important elements of a class of biological neural systems can be translated into mathematical terms and how observed results emerge naturally from this translation. You leave an encounter with it feeling you have really understood something about the way things work -- and, although these methods have only been applied to the first visual area in the cortex (for the most part), they are general enough that they provide more than an inkling about what must be happening further in. (Long way to go though!) There are other examples... Adrian From jagota at cse.ucsc.edu Fri Aug 21 22:09:09 1998 From: jagota at cse.ucsc.edu (Arun Jagota) Date: Fri, 21 Aug 1998 19:09:09 -0700 Subject: Record/archive of debate? Message-ID: <199808220209.TAA18579@arapaho.cse.ucsc.edu> It would be nice if some sort of a record of the "Connectionist Symbol Processing" debate were to be produced and archived for the benefit of the community. With this in mind I have a specific proposal (which ties in with an unusual idea I have thought about on occasion). I invite interested individuals (especially those who contributed to this e-debate) to consider contributing a brief survey of their relevant work (positive or negative) on Connectionist Symbol Processing, a brief survey of some other group's relevant work that they are well-acquainted with, or a brief summary of their position on this topic. The aim is to collect these contributions into an informal, survey-type, "distributed article". 
Contributors whose contributions would be accepted would become its co-authors (hence the term distributed). Such an article, with the individual contribution acceptance and "article-wide review for improvement" processes yet to be determined, I would aim to have archived in Neural Computing Surveys. If you are seriously interested in contributing, inform me of your intent to contribute by e-mail (jagota at cse.ucsc.edu). Whether we proceed to the implementation phase will depend on the feedback I receive. I'd expect contributions to range from half a page to a page. Contributors to this planned article who contributed e-mail messages to this debate might well send polished versions of their e-mail messages (but clearly not those of others) as contributions. Also, plain text contributions will normally suffice. (My aim is to make it as easy as possible for you to contribute what specialized knowledge you have access to, towards this hopefully community-benefiting article. Having said that, there will be some sort of individual contribution review/acceptance threshold and subsequent article-wide review.) I will not be an author. For comments/questions, contact me directly, NOT the connectionists list. Arun Jagota jagota at cse.ucsc.edu From juergen at idsia.ch Sat Aug 22 06:18:50 1998 From: juergen at idsia.ch (Juergen Schmidhuber) Date: Sat, 22 Aug 1998 12:18:50 +0200 Subject: recent debate Message-ID: <199808221018.MAA04799@ruebe.idsia.ch> A side note on what Jon Baxter wrote: > In contrast, in a "learning to learn" setting where a learner is faced > with a (potentially infinite) *sequence* of learning tasks ... A more appropriate name for this is "inductive transfer." The traditional meaning of "learning to learn" is "learning learning algorithms." It refers to systems that search a space whose elements are credit assignment strategies, and is conceptually independent of whether or not there are different tasks. For instance, in contexts where there is only one task (such as receiving a lot of reward over time) the system may still be able to "learn to learn" by using experience for continually improving its learning algorithm (more on this on my home page). A note on what Bryan Thompson wrote: > If we consider that the primary mechanism of recurrence in a > distributed representations as enfolding space into time, I still have > reservations about the complexity that the agent / organism faces in > learning an enfolding of mechanisms sufficient to support symbolic > processing. There is a recurrent net method called "Long Short-Term Memory" (LSTM) which does not require "enfolding space into time". LSTM's learning algorithm is local in both space and time (unlike BPTT's and RTRL's). Despite its low computational complexity LSTM can learn algorithmic solutions to many "symbolic" and "subsymbolic" tasks (according to the somewhat vague distinctions that have been proposed) that BPTT/RTRL and other existing recurrent nets cannot learn: Sepp Hochreiter and J. Schmidhuber. Long Short-Term Memory. Neural Computation, 9(8):1735-1780, 1997. Juergen Schmidhuber, IDSIA www.idsia.ch/~juergen From tabor at CS.Cornell.EDU Sat Aug 22 15:04:49 1998 From: tabor at CS.Cornell.EDU (Whitney Tabor) Date: Sat, 22 Aug 1998 15:04:49 -0400 (EDT) Subject: Connectionist symbol processing---any progress?
Message-ID: <199808221904.PAA04651@gibson.cs.cornell.edu> David Touretzky wrote: > I concluded that connectionist symbol processing had reached a > plateau, and further progress would have to await some revolutionary > new insight about representations. and mentioned Tony Plate's inspiring work on Holographic Reduced Representations (HRRs). I don't think that anybody has yet mentioned the related line of work which sets aside the learning problem for the moment and focuses on the geometry of the trajectories of metric (and vector) space computers, including many connectionist networks. One essential idea (proposed in embryonic form in Pollack, 1987, 1991, Siegelmann and Sontag, 1991, Wiles and Elman, 1995, Rodriguez, Wiles, and Elman, to appear) is to use fractals to organize recursive computations in a bounded metric space. Cris Moore (Moore, 1996) provides the first substantial development of this idea, relating it to the traditional practice of classifying machines based on their computational power. He shows, for example, that every context free language can be recognized by some "dynamical recognizer" that moves around on an elaborated, one-dimensional Cantor Set. I have described a similar method which operates on high-dimensional Cantor sets and thus leads to an especially natural implementation in neural hardware (Tabor, 1998, submitted). This approach sheds some new light on the symbolic vs. metric space computation question by showing how we can use the structured entities recognized by traditional computational theory (e.g. particular context free grammars) as bearing points in navigating the larger set (Siegelmann, 1996; Moore, 1996) of computing devices embodied in many analog computers. To my knowledge, no one has tried to use this kind of computational/geometric perspective to interpret HRRs and related outer product representations---I think this would be a very rewarding project. Whitney Tabor University of Connecticut http://www.cs.cornell.edu/home/tabor/tabor.html @UNPUBLISHED{Pollack:87, AUTHOR = {Jordan B. Pollack}, TITLE = {On Connectionist Models of Natural Language Processing}, NOTE = {Ph.D. Thesis, Department of Computer Science, University of Illinois}, YEAR = {1987}, } @ARTICLE{S&S:91, AUTHOR = {H. T. Siegelmann and E. D. Sontag}, TITLE = {Turing computability with neural nets}, JOURNAL = {Applied Mathematics Letters}, YEAR = {1991}, VOLUME = {4}, NUMBER = {6}, PAGES = {77-80}, } @ARTICLE{Pollack:91, AUTHOR = {Jordan B. Pollack}, TITLE = {The Induction of Dynamical Recognizers}, JOURNAL = {Machine Learning}, YEAR = {1991}, VOLUME = {7}, PAGES = {227-252}, } @INCOLLECTION{W&E:95, AUTHOR = {Janet Wiles and Jeff Elman}, TITLE = {Landscapes in Recurrent Networks}, BOOKTITLE = {Proceedings of the 17th Annual Cognitive Science Conference}, EDITOR = {Johanna D. Moore and Jill Fain Lehman}, PUBLISHER = {Lawrence Erlbaum Associates}, YEAR = {1995}, } @ARTICLE{R&W&E:ta, AUTHOR = {Paul Rodriguez and Janet Wiles and Jeffrey Elman}, TITLE = {How a Recurrent Neural Network Learns to Count}, JOURNAL = {Connection Science}, YEAR = {ta}, VOLUME = {}, NUMBER = {}, PAGES = {}, } @UNPUBLISHED{Moore:96b, AUTHOR = {Christopher Moore}, TITLE = {Dynamical Recognizers: Real-time Language Recognition by Analog Computers}, NOTE = {TR No.
96-05-023, Santa Fe Institute}, YEAR = {1996}, } @ARTICLE{Siegelmann:96, AUTHOR = {Hava Siegelmann}, TITLE = {The simple dynamics of super {T}uring theories}, JOURNAL = {Theoretical Computer Science}, YEAR = {1996}, VOLUME = {168}, PAGES = {461-472}, } @UNPUBLISHED{Tabor:98, AUTHOR = {Whitney Tabor}, TITLE = {Dynamical Automata}, NOTE = {47 pages. Tech Report \# 98-1694. Department of Computer Science, Cornell University. Download from http://cs-tr.cs.cornell.edu/}, YEAR = {1998}, } @UNPUBLISHED{Tabor:subb, AUTHOR = {Whitney Tabor}, TITLE = {Context Free Grammar Representation in Neural Networks}, NOTE = {7 pages. Draft version available at http://simon.cs.cornell.edu/home/tabor/papers.html}, YEAR = {submitted to NIPS}, } From arbib at pollux.usc.edu Mon Aug 24 02:01:18 1998 From: arbib at pollux.usc.edu (Michael A. Arbib) Date: Sun, 23 Aug 1998 22:01:18 -0800 Subject: What have neural networks achieved? Message-ID: >From: Adrian Robert >Date: Fri, 21 Aug 98 12:33:46 -0700 > >... everyone seems to have been coming up with >commercial applications (using NNs as statistical analyzers) but as far as >his >first question -- about generation of insight into real brain function -- >zero. My thanks to Adrian for reminding you of this question! For example, I think that Houk and Barto, Kawato, and my group (Schweighofer and Spoelstra) have begun to make real progress in elucidating the role of cerebellum in motor control. So: I would like to see responses of the form: "Models A and B have shown the role of brain regions C and D in functions E and F - see specific references G and H". The real interest comes when claims appear to conflict: Can we unify theories on the roles of cerebellum in both motor control and classical conditioning? What about the role of hippocampus in both spatial navigation and consolidation of short term memory? Thanks again, Adrian Robert! ********************************* Michael A. Arbib USC Brain Project University of Southern California Los Angeles, CA 90089-2520, USA arbib at pollux.usc.edu (213) 740-9220; Fax: 213-740-5687 http://www-hbp.usc.edu/HBP/ From bmg at cs.rmit.edu.au Sun Aug 23 10:49:54 1998 From: bmg at cs.rmit.edu.au (B Garner) Date: Mon, 24 Aug 1998 00:49:54 +1000 (EST) Subject: Record/archive of debate? Message-ID: <199808231449.AAA02154@numbat.cs.rmit.edu.au> * * It would be nice if some sort of a record of the "Connectionist * Symbol Processing" debate were to be produced and archived for * the benefit of the community. * I think this would be a good idea, because there were so many interesting ideas expressed. I recently published 2 training algorithms which I call symbolic; they are found at http://yallara.cs.rmit.edu.au/~bmg/algA.ps and algB.ps. I have included the abstracts below. I say these algorithms are symbolic because they train the networks without finding a numerical solution: sets of constraints are derived during training instead. These constraints show that the weights and the thresholds are all in relationship to each other at each 'neuron'. I have thought a lot about what 'symbol' means, and I have decided, largely, that a 'symbol' is something that takes its meaning from those symbols "around" it. Perhaps there are better definitions because this one is self-referential. But this idea of symbol is close to, apparently, structural linguistics. Conveniently, perhaps you might think, this idea supports the results of my training algorithms.
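[To illustrate what a constraint-based, rather than numerical, solution for a linear threshold gate can look like, here is a minimal sketch that records, for each binary training example, the inequality the weights and threshold must jointly satisfy. This shows only the idea of a constraint set for a single LTG; it is not the training algorithms described in the abstracts that follow.]

def ltg_constraints(examples):
    """Collect the symbolic constraints a single linear threshold gate must
    satisfy: for input x with target 1, require w.x >= T; for target 0, w.x < T.

    Returns the constraints as readable strings over symbolic weights w1..wn
    and threshold T (binary inputs assumed).
    """
    constraints = []
    for x, target in examples:
        lhs = " + ".join(f"w{i+1}" for i, xi in enumerate(x) if xi) or "0"
        constraints.append(f"{lhs} >= T" if target else f"{lhs} < T")
    return constraints

# AND of two binary inputs.
for c in ltg_constraints([((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]):
    print(c)
# prints: 0 < T;  w2 < T;  w1 < T;  w1 + w2 >= T

[Any assignment satisfying the printed inequalities, e.g. w1 = w2 = 1 and T = 1.5, implements the gate, which is the sense in which the constraint set itself can stand in for a numerical solution.]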
I have read some of the arguments in this debate, where someone said that the topology of the input space needs to be examined. With my second algorithm, once you know the topology of the input space the problem can be transformed and learnt very simply. Even problems such as the twin spiral problem can be learnt with one hidden layer. These algorithms are very simple, but I haven't finished writing up all the results I have yet. Here are the abstracts: A symbolic solution for adaptive feedforward neural networks found with a new training algorithm B. M. Garner, Department of Computer Science, RMIT, Melbourne, Australia. ABSTRACT Traditional adaptive feed forward neural network (NN) training algorithms find numerical values for the weights and thresholds. In this paper it is shown that an NN composed of linear threshold gates (LTGs) can function as a fully trained neural network without finding numerical values for the weights and thresholds. This surprising result is demonstrated by presenting a new training algorithm for this type of NN that resolves the network into constraints which describe all the numeric values the NN's weights and thresholds can take. The constraints do not require a numerical solution for the network to function as a fully trained NN which can generalize. The solution is said to be symbolic as a numerical solution is not required. ************************************************************************** A training algorithm for Adaptive Feedforward Neural Networks that determines its own topology B. M. Garner, Department of Computer Science, RMIT, Melbourne, Australia. ABSTRACT There has been some interest in developing neural network training algorithms that determine their own architecture. A training algorithm for adaptive feedforward neural networks (NN) composed of Linear Threshold Gates (LTGs) is presented here that determines its own architecture and trains in a single pass. This algorithm produces what is said to be a symbolic solution as it resolves the relationships between the weights and the thresholds into constraints which do not need to be solved numerically. The network has been shown to behave as a fully trained neural network which generalizes, and the possibility that the algorithm has polynomial time complexity is discussed. The algorithm uses binary data during training. From sml at essex.ac.uk Mon Aug 24 03:49:17 1998 From: sml at essex.ac.uk (Simon Lucas) Date: Mon, 24 Aug 1998 08:49:17 +0100 Subject: Connectionist symbol processing Message-ID: <35E11AFD.5232DE9A@essex.ac.uk> I would suggest that most recurrent neural net architectures are not fundamentally more 'neural' than hidden Markov models - think of an HMM as a neural net with second-order weights and linear activation functions. HMMs are, of course, very much alive and kicking, and routinely and successfully applied to problems in speech and OCR, for example. It might be argued that HMMs tend to employ less distributed representations than RNNs, but even if this is true, so what?
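[A minimal sketch of the analogy just mentioned: the standard HMM forward recursion written as a recurrent update in which the state vector is propagated through the transition weights and then gated multiplicatively by the observation likelihoods -- the "second-order" interaction. The toy model numbers are made up; this is the plain forward algorithm, not Bridle's alpha-net.]

import numpy as np

def forward(A, B, pi, obs):
    """HMM forward recursion as a linear recurrent update.

    alpha_t = (A.T @ alpha_{t-1}) * B[:, obs_t]. Returns P(obs | model).
    """
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (A.T @ alpha) * B[:, o]
    return alpha.sum()

# A 2-state, 2-symbol toy model (illustrative numbers only).
A = np.array([[0.9, 0.1],
              [0.2, 0.8]])        # state transition probabilities
B = np.array([[0.7, 0.3],
              [0.1, 0.9]])        # observation probabilities per state
pi = np.array([0.5, 0.5])
print(forward(A, B, pi, [0, 1, 1, 0]))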
Some interesting work that has explored links between the two: @ARTICLE{Bridle-alpha-net, AUTHOR = "Bridle, J.S.", TITLE = "Alpha-nets: a recurrent ``neural'' network architecture with a hidden Markov model interpretation", JOURNAL = "Speech Communication", YEAR = "(1990)", VOLUME = 9, PAGES = "83 -- 92"} @ARTICLE{Bengio-iohmm, AUTHOR = "Bengio, Y and Frasconi, P", TITLE = "Input-output HMMs for sequence processing", JOURNAL = "IEEE Transactions on Neural Networks", YEAR = "(1996)", VOLUME = 7, PAGES = "1231 -- 1249"} Also related to the discussion is the Syntactic Neural Network (SNN) - an architecture I developed in my PhD thesis (refs below). The SNN is a modular architecture that is able to parse and (in some cases) infer context-free (and therefore also regular, linear, etc.) grammars. The architecture is composed of Local Inference Machines (LIMs) that rewrite pairs of symbols. These are then arranged in a matrix parser formation (see Younger1967) to handle general context-free grammars - or we can alter the SNN macro-structure in order to specifically deal with simpler classes of grammar such as regular, strictly-hierarchical or linear. The LIM remains unchanged. In my thesis I only developed a local learning rule for the strictly-hierarchical grammar, which was a specialisation of the Inside/Outside algorithm for training stochastic context-free grammars. By constructing the LIMs from forward-backward modules (see Lucas-fb), however, any SNN that you construct automatically has an associated training algorithm. I've already proven this to work for regular grammars; I'm now in the process of testing some other cases - I'll post the paper to this group when it's done. refs: @ARTICLE{Younger1967, AUTHOR = "Younger, D.H.", TITLE = "Recognition and parsing of context-free languages in time $n^{3}$", JOURNAL = "Information and Control", VOLUME = 10, NUMBER = 2, PAGES = "189 -- 208", YEAR = "(1967)"} @ARTICLE{Lucas-snn1, AUTHOR = "Lucas, S.M. and Damper, R.I.", TITLE = "Syntactic neural networks", JOURNAL = "Connection Science", YEAR = "(1990)", VOLUME = "2", PAGES = "199 -- 225"} @ARTICLE{Lucas-phd, AUTHOR = "Lucas, S.M.", TITLE = "Connectionist Architectures for Syntactic Pattern Recognition", JOURNAL = "PhD Thesis, University of Southampton", YEAR = "(1991)"} ftp://tarifa.essex.ac.uk/images/sml/reports/fbnet.ps @INCOLLECTION{Lucas-fb, AUTHOR = "Lucas, S.M.", TITLE = "Forward-backward building blocks for evolving neural networks with intrinsic learning behaviours", BOOKTITLE = "Lecture Notes in Computer Science (1240): Biological and artificial computation: from neuroscience to technology", YEAR = "(1997)", PUBLISHER = "Springer-Verlag", PAGES = "723 -- 732", ADDRESS = "Berlin"} ------------------------------------------------ Simon Lucas Department of Electronic Systems Engineering University of Essex Colchester CO4 3SQ United Kingdom Tel: (+44) 1206 872935 Fax: (+44) 1206 872900 Email: sml at essex.ac.uk http://esewww.essex.ac.uk/~sml secretary: Mrs Wendy Ryder (+44) 1206 872437 ------------------------------------------------- From michael.j.healy at boeing.com Mon Aug 24 14:25:57 1998 From: michael.j.healy at boeing.com (Michael J. Healy 425-865-3123) Date: Mon, 24 Aug 1998 11:25:57 -0700 Subject: Connectionist symbolic processing Message-ID: <199808241825.LAA15169@lilith.network-b> I've been doing research in connectionist symbol processing for some time, so I'd like to contribute something to the discussion. I'll try to keep it brief and just say what I'm ready to say.
I am not prepared to address Michael Arbib's question about real brain function at this time, although it's possible to make a connection. First, here are some references to the literature of rule extraction with neural networks, which I have been following. The list omits a lot of good work, but is meant to be representative: Andrews, R., Diederich, J. & Tickle, A. B. (1995) "Survey and critique of techniques for extracting rules from trained artificial neural networks", Knowledge-Based Systems, vol. 8, no. 6, 373-389. Craven, M. W. & Shavlik, J. W. (1993) "Learning Symbolic Rules using Artificial Neural Networks", Proceedings of the 10th International Machine Learning Conference, Amherst, MA. 73-80. San Mateo, CA:Morgan Kaufmann. Healy, M. J. & Caudell, T. P. (1997) "Acquiring Rule Sets as a Product of Learning in a Logical Neural Architecture", IEEE Transactions on Neural Networks, vol. 8, no. 3, 461-475. Kasabov, N. K. (1996) "Adaptable neuro production systems", Neurocomputing, vol. 13, 95-117. Setiono, R. (1997) "Extracting Rules from Neural Networks by Pruning and Hidden-Unit Splitting", Neural Computation, vol. 9, no. 1, 205-225. Sima, J. (1995) "Neural Expert Systems", Neural Networks, vol. 8, 261-271. Most of the work is empirical, but is accompanied by analyses of the practical aspects of extracting knowledge from data and of incorporating pre-existing knowledge along with the extracted knowledge. The supposed knowledge here is mostly in the form of if-then rules which, to greater or lesser extent, represent propositional statements. There is also some recent work on mathematically formalizing connectionist symbolic computations, for example: Pinkas, G. (1995) "Reasoning, nonmonotonicity and learning in connectionist networks that capture propositional knowledge", Artificial Intelligence 77, 203-247. I've been developing a formal semantic model for neural networks--- a mathematical model of concept representation in connectionist memories and learning by connectionist systems. I've found that such a model requires an explicit semantic model, in which the "universe" of things the concepts are about receives as much attention in the mathematical model as the concepts themselves. I think this is essential for resolving the ambiguities that crop up in discussions about symbolic processing and neural networks. For example, it allows me to make some statements about issues brought up in the discussion of connectionist symbol processing. Whether you agree with me or not, I'd certainly be interested in further discussion. I've been concentrating on geometric logic and its model theory (different sense of the word "model"), mostly (so far) in the form of point-set topology. The set-theoretic form is the simple version of the semantics of geometric logic. It's really a categorical logic, so the full semantic model requires category theory. Geometric logic is very strict in what it takes to assert a statement. It is meant to represent observational statements, ones whose positive instances can be observed. Topology is commonly studied in its point-set version, but the categorical form is better for formal semantics. Having said that, I'll stick with sets in the following. Also, I'll refer to the models of a theory as its instances. My main finding to date is that a sound and complete rule base--- one in which the rules are actually valid for all the data and which has all the rules---has the semantics of a continuous function between the right topological spaces. 
This requires some explaining, not only the "all the rules" and "right topological spaces" business, but also the statement about continuous functions. For most, continuity means continuous functions on the real or complex numbers, or on vector spaces over same. But those are a special case: the topologies and continuous functions I work with also involve spaces normally represented by discrete- valued variables. Continuity is really the mathematical way of saying "similar things map to similar things". My first publication on this has some details (a more extensive treatment is to appear): M. J. Healy, Continuous Functions and Neural Network Semantics, Proc. of Second World Cong. of Nonlinear Analysts (WCNA96), Athens. In Nonlinear Analysis, Volume 30, issue #3, 1997. pp. 1335-1341 In geometric logic, a continuous function is two functions---a mapping from the instances (worlds, states, models) of theory A to the instances of theory B, and an associated mapping from the formulas of theory B to those of theory A. Without going into too much detail, the topological connection is that a set of things that satisfy a formula (instances of the formula) form an open set in a particular topological space. In the applications we often deal with, the training examples for a neural network are instances of a theory of the domain of application. A formula in the theory expresses a property of or a relation between instances. The instances are called "points" of the space, and the corresponding open set contains the points. Finite conjunctions of formulas correspond to the finite intersections of open sets, and we allow arbitrary disjunctions, corresponding to the unions (arbitrary disjunctions are appropriate for observations). There is a little more to it, because instead of the usual set unions we use unions over directed sets of subsets. A valid and complete rule base can be refined to have the form of the formula mapping half of a continuous function from space A (theory A and its instances, with the induced topology) to space B (as a special case, the two spaces can be the same, or can have the same points). Correspondingly, the open set for the antecedent of each refined rule is the inverse image under the point mapping of the open set for its consequent. The refinement is obtained by forming the disjunction of all antecedents with the same consequent. The points mapping of the continuous function expresses the fact that every instance of the antecedent of a rule must map to an instance of the consequent of the rule, where the rule expresses truth-preservation. This mathematical model relates directly to the work being done in rule extraction, even with the many different approaches and neural network models in use. Furthermore, I think it supports intuition, but I'd like you to be the judge. One thing I'd like to add is that the topological model is consistent with probabilistic modeling and fuzzy logic. The focus of this model really is upon semantics (or semiotics, if this is regarded as a model of sign-meaning relationships; I am mostly interested in the semantics). Finally, I'd like to comment upon an important issue that has appeared in this thread---how important is the input space topology (metric, structure, theory, ... )? I apologize if I've misiniterpreted any of what's been said, but here's my two cents. I don't think there is always a single "right" topological space. 
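The continuity condition being used here -- the preimage of every open set is open -- can be checked mechanically when the spaces are finite, which may help make the preceding paragraph concrete. The points, open sets, and mapping below are all invented for illustration; none of this is Healy's construction:

def preimage(f, subset):
    # f maps points of X to points of Y; return the preimage of a subset of Y
    return frozenset(x for x in f if f[x] in subset)

def is_continuous(f, opens_X, opens_Y):
    # continuous <=> the preimage of every open set of Y is an open set of X
    return all(preimage(f, U) in opens_X for U in opens_Y)

# "instances" of a toy input theory; open sets play the role of formulas
X_opens = {frozenset(), frozenset({"a"}), frozenset({"a", "b"}),
           frozenset({"a", "b", "c"})}
# "instances" of a toy output theory (say, class labels)
Y_opens = {frozenset(), frozenset({"pos"}), frozenset({"pos", "neg"})}

f = {"a": "pos", "b": "pos", "c": "neg"}     # a candidate rule-base mapping
print(is_continuous(f, X_opens, Y_opens))    # True: preimage of {pos} is {a, b}

Here the map is continuous because the only nontrivial preimage, that of {"pos"}, is itself a declared open set of the input space; removing {"a", "b"} from X_opens would break continuity.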
The form of the data and how you handle it depends on what assumptions were made in working up the data for presentation as training (or testing) examples for the neural network. Formalizing, I would say that the assumptions yield a theory about the domain of inputs, and this in turn yields a topology. The topology does not have to be induced by a metric, not unless you make the assumption that distances between data points (in the metric sense) are valid. For example, if you have applied a Euclidean clustering algorithm, you have implicitly made the assumption that the Euclidean metric is the semantics of the application items that are being encoded as data items. What you get will be partly a result of that assumption. But what you get also depends upon the assumptions underlying the algorithm. If the algorithm is really coming up with anything new, it will impose a new topology upon the data. For example, a Euclidean clustering algorithm doesn't return all the open balls of the Euclidean-induced topology---it returns a finite set of class representations. However, you'd like the final result to have some connection with your original interpretation of the data, since after all that was your way of seeing the application. So, it would be nice to have continuity, meaning that every instance of the input domain theory maps to an instance of the output theory in a manner consistent with the formulas (open sets) in both theories (topologies). An advantage of the continuous function model here is that it tells me what I need to do: Modify the topologies (hence the theories) so that the inverse of an open set is open. Of course, that's only a mathematical abstraction, and the question is still So what do you do? Well, I don't think you want to discard the input topology outright, for the reason I gave: It is the theory that gave you your data. But you can modify it if need be. If you assumed a metric and your final classification result (assuming you were doing classification or clustering) has an output domain consisting of metric-induced open sets, you need do nothing. You can get more information by going to a more sophisticated pair of spaces by an embedding, but at least your algorithm gave you classes that projected back into the input topology, so you're OK there. However, for many data and machine models, the input space (or the output space or both) won't accept the projections gracefully, so you need to do something. One thing you can do is suppose you have the wrong learning algorithm and try to find one that will automatically yield continuity without changing the input space. Another thing you can do is suppose that the algorithm is telling you something about the input data space, and modify the topology as needed to accept the new open sets (extend the sub-base of the topology). See how your application looks now! How you proceed from here depends upon what kinds of properties you want to study. What I'm proposing is that the topological model is good as a guide for further work because of its mathematical precision in semantic modeling. Regards, Mike Healy -- =========================================================================== e Michael J. Healy A FA ----------> GA (425)865-3123 | | FAX(425)865-2964 | | Ff | | Gf c/o The Boeing Company | | PO Box 3707 MS 7L-66 \|/ \|/ Seattle, WA 98124-2207 ' ' USA FB ----------> GB -or for priority mail- e "I'm a natural man." 
2760 160th Ave SE MS 7L-66 B Bellevue, WA 98008 USA michael.j.healy at boeing.com -or- mjhealy at u.washington.edu ============================================================================ From adr at nsma.arizona.edu Mon Aug 24 17:35:10 1998 From: adr at nsma.arizona.edu (David Redish) Date: Mon, 24 Aug 1998 14:35:10 -0700 Subject: What have neural networks achieved? In-Reply-To: Your message of "Sun, 23 Aug 1998 22:01:18 PST." Message-ID: <199808242135.OAA20708@cortex.NSMA.Arizona.EDU> Michael Arbib wrote: >So: I would like to see responses of the form: >"Models A and B have shown the role of brain regions C and D in functions E >and F - see specific references G and H". >The real interest comes when claims appear to conflict: >Can we unify theories on the roles of cerebellum in both motor control and >classical conditioning? >What about the role of hippocampus in both spatial navigation and >consolidation of short term memory? In terms of the role of the hippocampus, a number of conflicting hypotheses have recently been shown not to be incompatible through computational modeling. The two major theories that have been argued over for the last twenty-plus years are (1) that the hippocampus forms a cognitive map for navigation (e.g. O'Keefe and Nadel, 1978) and (2) that the hippocampus stores episodic memories temporarily and replays them for consolidation into cortex (e.g. Cohen and Eichenbaum, 1993). We (David Touretzky and I, see Touretzky and Redish, 1996; Redish and Touretzky 1997, Redish 1997) examined the role of the hippocampus in the navigation domain by looking at the whole rodent navigation system (thereby attempting to put the role of the hippocampus in an anatomical context of a greater functional system). By looking at computational complexities and extensive simulations, we determined that the most likely role of the hippocampus in navigation is to allow an animal to reset an internal coordinate system on re-entry into an environment (i.e. to *self-localize* on returning to an environment). (From this theory we predicted that hippocampal lesions should not affect the ability of animals to wander out and return to a starting point, an ability called path integration which had previously been hypothesized to be hippocampally-dependent. This prediction has been borne out by recent experiments, Alyan and McNaughton, 1997). It is straight-forward to extend this idea of self-localization to a "return to context" which explains a large literature of primate data (Redish, 1999). In addition to the self-localization role, the hippocampus has been shown to replay recently traveled routes during sleep (Skaggs and McNaughton, 1996). However, the mechanisms that have been proposed to accomplish these two functions require incompatible connection matrices. Self-localization requires a symmetric component and route-replay requires an asymmetric component. We showed (Redish and Touretzky, 1998) that with the incorporation of external inputs representing spatial cues during self-localization (obviously necessary for accurate self-localization), self-localization can be accurate even with a weak asymmetric component, and that the weak asymmetric component is sufficient to replay the recently traveled routes (without the external input, which would presumably not be present during sleep). This shows that the two roles hypothesized for hippocampus are not incompatible. REFERENCES S. H. Alyan and B. M. Paul and E. Ellsworth and R. D. White and B. L. 
McNaughton (1997) Is the hippocampus required for path integration? Society for Neuroscience Abstracts. 23:504. N. J. Cohen and H. Eichenbaum (1993) Memory, Amnesia, and the Hippocampal System, MIT Press, Cambridge, MA. J. O'Keefe and L. Nadel (1978) The Hippocampus as a Cognitive Map, Clarendon Press, Oxford. A. D. Redish and D. S. Touretzky (1997) Cognitive Maps Beyond the Hippocampus, Hippocampus, 7(1):15-35. A. D. Redish (1997) Beyond the Cognitive Map: Contributions to a Computational Neuroscience Theory of Rodent Navigation, PhD Thesis. Carnegie Mellon University, Pittsburgh PA. A. D. Redish and D. S. Touretzky (1998) The role of the hippocampus in solving the Morris Water Maze, Neural Computation, 10(1):73-111. A. D. Redish (in press) Beyond the Cognitive Map: From Place Cells to Episodic Memory, MIT Press, Cambridge MA. W. E. Skaggs and B. L. McNaughton (1996) Replay of neuronal firing sequences in rat hippocampus during sleep following spatial experience, Science, 271:1870-1873. D. S. Touretzky and A. D. Redish (1996) A theory of rodent navigation based on interacting representations of space, Hippocampus, 6(3):247-270. ----------------------------------------------------- A. David Redish adr at nsma.arizona.edu Post-doc http://www.cs.cmu.edu/~dredish Neural Systems, Memory and Aging, Univ of AZ, Tucson AZ ----------------------------------------------------- From negishi at cns.bu.edu Tue Aug 25 05:11:25 1998 From: negishi at cns.bu.edu (Michiro Negishi) Date: Tue, 25 Aug 1998 05:11:25 -0400 Subject: Connectionist symbolic processing In-Reply-To: <199808241825.LAA15169@lilith.network-b> (michael.j.healy@boeing.com) Message-ID: <199808250911.FAA09262@music.bu.edu> Here are my 5 cents, from the self-organizing camp. On Mon, 10 Aug 1998 Dave_Touretzky at cs.cmu.edu wrote: > The problem, though, was that we > did not have good techniques for dealing with structured information > in distributed form, or for doing tasks that require variable binding. > While it is possible to do these things with a connectionist network, > the result is a complex kludge that, at best, sort of works for small > problems, but offers no distinct advantages over a purely symbolic > implementation. As many have already argued, at least empirically, I don't see the issue of structured data representation as *the main* obstacle in constructing a model of symbolic processing, although it is an interesting and challenging subject. In my neural model of syntactic analysis and thematic role assignment, for instance, I use the following neural fields for representing a word/phrase. (1) A field for representing the word or the head of the phrase. (There is a computational algorithm for determining the head of a phrase.) (2) Fields for representing the features of the word/phrase as well as its children in the syntactic tree (or the semantic structure). Features are obtained by PCA over the contexts in which the word/phrase appears. (3) Associator fields for retrieving children and the parent. In plain words, (1) is the lexical information, (2) is the featural information, and (3) is the associative pointer. The resultant representation is similar to RAAM. A key point in the feature extraction in the model is that once the parser begins to combine words into phrases, it begins to collect distributions in terms of the heads of the phrases, which in turn are used in the PCA.
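The feature-extraction step described above -- PCA over the contexts in which a word (or phrase head) appears -- can be illustrated in a few lines of numpy. The toy corpus, the one-word window, and the two-dimensional projection are arbitrary choices for illustration and are not the settings of Negishi's model:

import numpy as np

corpus = ["the cat sat", "the dog sat", "a cat ran", "a dog ran"]
tokens = [s.split() for s in corpus]
vocab = sorted({w for sent in tokens for w in sent})
idx = {w: i for i, w in enumerate(vocab)}

# co-occurrence counts within a +/-1 word window
C = np.zeros((len(vocab), len(vocab)))
for sent in tokens:
    for i, w in enumerate(sent):
        for j in (i - 1, i + 1):
            if 0 <= j < len(sent):
                C[idx[w], idx[sent[j]]] += 1

# PCA: centre the count matrix and project onto the top two components
Cc = C - C.mean(axis=0)
U, S, Vt = np.linalg.svd(Cc, full_matrices=False)
features = Cc @ Vt[:2].T            # one 2-D feature vector per word
for w in vocab:
    print(w, np.round(features[idx[w]], 2))

Words with matching context distributions in this tiny corpus (cat/dog, the/a, sat/ran) come out with identical feature vectors, which is the kind of distributional similarity the model then exploits when it collects distributions over phrase heads.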
The model was trained using a corpus that contains mothers' input to the children (a part of the CHILDES corpus), so it's not a "toy" model, although it's not as good as being able to cope with the Wall Street Journal yet, (I have to crawl before I walk :) which was expectable considering the very strict learning conditions of the model: no initial lexical or syntactic knowledge, no external corrective signals from a teacher. I think it's a virtue rather than a defect that this type of representation does not represent all concepts at a time. In many languages, each word represents only very limited number of concepts at most, although it can also convey many features of itself and its children (eg. in many languages, agreement morphemes attached to a verb encode gender, person, etc. of the subject and objects). Also there are island effects, which shows that production of a clause can have access only to the concept itself and its direct children (and not internal structure below each child). I think that the real challenge is to do a cognitively plausible modeling that sheds a new light to the understanding of language and cognition. That is why I constrain myself to self-organizing networks. As for future direction I agree with Whitney Tabor that application of the fractal theory may be a promising direction. I would be interested to know if some one tried to interpret HPSG or more classical X-bar theory as fractals. Here are some refs on self-organizing models of language (except for the famous ones by Miikkulainen). This line of research is alive, and will kick soon. Ritter, H. and Kohonen, T. (1990). Learning semantotopic maps from context. Proceedings of IJCNN 90, Washington D.C., I. Sholtes, J. C. (1991). Unsupervised context learning in natural language processing. In Proc. IJCNN Seattle 1991. M. Negishi (1995) Grammar learning by a self-organizing network. In Advances in Neural Information Processing Systems 7, 27-35. MIT Press. My unpublished thesis work is accessible from http://cns-web.bu.edu/pub/mnx/negishi.html ----------------------------------------------------- Michiro Negishi ----------------------------------------------------- Dept. of Cognitive & Neural Systems, Boston Univ. 677 Beacon St., Boston, MA 02215 Email: negishi at cns.bu.edu Tel: (617) 353-6741 ----------------------------------------------------- From kdh at anatomy.ucl.ac.uk Tue Aug 25 07:57:51 1998 From: kdh at anatomy.ucl.ac.uk (Ken Harris) Date: Tue, 25 Aug 1998 12:57:51 +0100 Subject: Neural networks and brain function Message-ID: <199808251157.MAA01484@ylem.anat.ucl.ac.uk> I'd like to add something to the debate about neural network modelling and brain function, in particular concerning the resolution of apparently conflicting models. It seems to me that the main contribution of neural networks to this question has been a change of emphasis. Before neural networks were common currency, a neurological model usually consisted of a statement that a particular brain structure was necessary for a particular type of task. For example: "The cerebellum is necessary for motor control" "The hippocampus is necessary for spatial function" "The hippocampus is necessary for episodic memory" After neural networks, we have a different set of analogies. We now make neurological models that ascribe a particular computational function to a brain structure. 
For example: "The cerebellum performs supervised learning" "The hippocampus functions as an autoassociative memory" By talking about a computational function, rather than a type of task that a brain structure is needed for, a lot of apparent conflict can suddenly be resolved. In the example of the cerebellum, the evidence that the cerebellum is involved in motor control and classical conditioning, and even higher cognitive functions does not seem so contradictory. It is very plausible that a supervised learning network would be useful for all of these functions -- see for example the work of Kawato and Thompson. In the example of the hippocampus, work by Michael Recce and myself has shown how an autoassociative memory can play a role in both episodic memory and spatial function, in particular giving an animal localisation ability by performing pattern completion on partial egocentric maps. For those who might be interested: Recce, M. and Harris, K.D. "Memory for places: A navigational model in support of Marr's theory of hippocampal function" Hippocampus, vol 6, pp. 735-748 (1996) http://www.ncrl.njit.edu/papers/hpc_model.ps.gz ----------------------------------------------- Ken Harris Department of Anatomy and Developmental Biology University College London http://www.anat.ucl.ac.uk/~kdh From ingber at ingber.com Tue Aug 25 09:29:18 1998 From: ingber at ingber.com (Lester Ingber) Date: Tue, 25 Aug 1998 08:29:18 -0500 Subject: Paper: A simple options training model Message-ID: <19980825082918.A29949@ingber.com> The paper markets98_spread.ps.Z [40K] is available at my InterNet archive: %A L. Ingber %T A simple options training model %R LIR-98-2-SOTM %I Lester Ingber Research %C Chicago, IL %D 1998 %O URL http://www.ingber.com/markets98_spread.ps.Z Options pricing can be based on sophisticated stochastic differential equation models. However, many traders, expert in their art of trading, develop their skills and intuitions based on loose analogies to such models and on games designed to tune their trading skills, not unlike the state of affairs in many disciplines. An analysis of one such game reveals some simple but relevant probabilistic insights into the nature of options trading often not discussed in most texts. ======================================================================== Instructions for Retrieval of Code and Reprints Interactively Via WWW The archive can be accessed via WWW path http://www.ingber.com/ http://www.alumni.caltech.edu/~ingber/ where the last address is a mirror homepage for the full archive. Interactively Via Anonymous FTP Code and reprints can be retrieved via anonymous ftp from ftp.ingber.com. Interactively [brackets signify machine prompts]: [your_machine%] ftp ftp.ingber.com [Name (...):] anonymous [Password:] your_e-mail_address [ftp>] binary [ftp>] ls [ftp>] get file_of_interest [ftp>] quit The 00index file contains an index of the other files. Files have the same WWW and FTP paths under the main / directory; e.g., http://www.ingber.com/MISC.DIR/00index_misc and ftp://ftp.ingber.com/MISC.DIR/00index_misc reference the same file. Electronic Mail If you do not have WWW or FTP access, get the Guide to Offline Internet Access, returned by sending an e-mail to mail-server at rtfm.mit.edu with only the words send usenet/news.answers/internet-services/access-via-email in the body of the message. The guide gives information on using e-mail to access just about all InterNet information and documents. 
Additional Information Limited help assisting people with queries on my codes and papers is available only by electronic mail correspondence. Sorry, I cannot mail out hardcopies of code or papers. Lester ======================================================================== -- /* Lester Ingber Lester Ingber Research * * PO Box 06440 Wacker Dr PO Sears Tower Chicago, IL 60606-0440 * * http://www.ingber.com/ ingber at ingber.com ingber at alumni.caltech.edu */ From oreilly at grey.colorado.edu Tue Aug 25 11:54:10 1998 From: oreilly at grey.colorado.edu (Randall C. O'Reilly) Date: Tue, 25 Aug 1998 09:54:10 -0600 Subject: What have neural networks achieved? Message-ID: <199808251554.JAA15620@grey.colorado.edu> Another angle on the hippocampal story has to do with the phenomenon of catastrophic interference (McCloskey & Cohen, 1989), and the notion that the hippocampus and the cortex are complementary learning systems that each optimize different functional objectives (McClelland, McNaughton, & O'Reilly, 1995). In this case, the neural network approach provides a principled basis for understanding why we have a hippocampus, and what its functional characteristics should be. Interestingly, one of the "successes" of neural networks in this case was their dramatic failure in the form of the catastrophic interference phenomenon. This failure tells us something about the limitations of the cortical memory system, and thus, why we might need a hippocampus. - Randy @incollection{McCloskeyCohen89, author = {McCloskey, M. and Cohen, N. J.}, editor = {G. H. Bower}, title = {Catastrophic interference in connectionist networks: {The} sequential learning problem}, booktitle = {The Psychology of Learning and Motivation}, pages = {109-165}, year = 1989, publisher = {Academic Press}, address = {New York}, volume = 24 } @article{McClellandMcNaughtonOReilly95, author = {McClelland, J. L. and McNaughton, B. L. and O'Reilly, R. C.}, title = {Why There are Complementary Learning Systems in the Hippocampus and Neocortex: Insights from the Successes and Failures of Connectionist Models of Learning and Memory}, journal = {Psychological Review}, pages = {419-457}, year = {1995}, volume = {102} } This article has lots of references to the relevant neural network literature. A TR version is available from the following 2 ftp sites: ftp://cnbc.cmu.edu:/pub/pdp.cns/pdp.cns.94.1.ps.Z ftp://grey.colorado.edu/pub/oreilly/tr/pdp.cns.94.1.ps.Z +-----------------------------------------------------------------------------+ | Dr. Randall C. O'Reilly | | | Assistant Professor | | | Department of Psychology | Phone: (303) 492-0054 | | University of Colorado Boulder | Fax: (303) 492-2967 | | Muenzinger D251C | Home: (303) 448-1810 | | Campus Box 345 | email: oreilly at psych.colorado.edu | | Boulder, CO 80309-0345 | www: http://psych.colorado.edu/~oreilly | +-----------------------------------------------------------------------------+ From lazzaro at CS.Berkeley.EDU Tue Aug 25 12:27:58 1998 From: lazzaro at CS.Berkeley.EDU (John Lazzaro) Date: Tue, 25 Aug 1998 09:27:58 -0700 (PDT) Subject: Connectionist symbol processing Message-ID: <199808251627.JAA07316@snap.CS.Berkeley.EDU> > I would suggest that most recurrent neural net architectures > are not fundamentally more 'neural' than hidden Markov models - > think of an HMM as a neural net with second-order weights > and linear activation functions.
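The quoted correspondence can be made concrete by writing out the HMM forward ("alpha") recursion: it is a linear (identity-activation) recurrent update whose effective weights are products of a fixed transition term and an input-dependent emission term, i.e. second-order weights -- the observation behind Bridle's alpha-nets cited earlier in this thread. A toy numerical sketch with made-up parameters, for illustration only:

import numpy as np

A = np.array([[0.7, 0.3],          # state-transition probabilities
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],          # B[state, observation] emission probabilities
              [0.2, 0.8]])
pi = np.array([0.5, 0.5])          # initial state distribution

def forward(obs):
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        # linear recurrence; the effective weight A[i, j] * B[j, o] is a
        # product of a fixed weight and an input-dependent term
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()             # likelihood of the observation sequence

print(forward([0, 1, 1]))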
We presented a continuous-time analog-circuit implementation of a HMM state decoder a few years ago at NIPS -- I've always felt that if you can make a clock-free analog system compute an algorithm well in silicon, its reasonable to expect it can be implemented usefully in biological neurons as well ... Lazzaro, J. P., Wawrzynek J., and Lippmann, R. (1996). A micropower analog VLSI HMM state decoder for wordspotting. In Jordan, M., Mozer, M., and Petsche, T. (eds), {\it Advances in Neural Information Processing Systems 9}. Cambridge, MA: MIT Press. Lazzaro, J., Wawrzynek, J., Lippmann, R. P. (1997). A micropower analog circuit implementation of hidden markov model state decoding. {\it IEEE Journal Solid State Circuits} {\bf 32}:8, 1200--1209. http://www.cs.berkeley.edu/~lazzaro/biblio/decoder.ps.gz --john lazzaro From jlm at cnbc.cmu.edu Tue Aug 25 14:26:33 1998 From: jlm at cnbc.cmu.edu (Jay McClelland) Date: Tue, 25 Aug 1998 14:26:33 -0400 (EDT) Subject: What have neural networks achieved? Message-ID: <199808251826.OAA27165@CNBC.CMU.EDU> I haven't been able to read all of the email on connectionists lately and so it is possible that the following is redundant, but it seems to me there is a real success story here. There has been a great deal of connectionist work on the processing of regular and exceptional material, initiated by the Rumelhart-McClelland paper on the past tense. Debate has raged on the subject of the past tense and work there is ongoing, but I won't claim a success story there at this time. What I would like to point to instead is the related topic of single word reading. Sejnowski and Rosenberg's NETTALK first extended connectionist ideas to this issue, and Seidenberg and McClelland went on to show that a connectionist model could account in great detail for the pattern of reaction times found in around 30 studies concerning the effects of regularity, frequency, and lexical neighbors on reading words aloud. This was followed by a resounding critique along the lines of Pinker and Prince's critique of R&M, coming this time from Derrick Besner (and colleagues) and Max Coltheart (and colleagues). Both pointed to the fact that the S&M model didn't do a very good job of reading nonwords, and both claimed that this reflected an in-principal limitation of a connectionist, single mechanism account: To do a good job with both, it was claimed, a dual route system was required. The success story is a paper by Plaut, McClelland, Seidenberg, and Patterson, in which it was shown in fact that a single mechanism, connectionist model can indeed account for human performance in reading both words and nonwords. The model replicated all the S&M findings, and at the same time was able to read non-words as well as human subjects, showing the same types of neighbor-driven responses that human readers show (eg MAVE is sometimes read to rhyme with HAVE instead of SAVE). Of course there are still some loose ends but it is no longer possible to claim that a single-mechanism account cannot capture the basic pattern of word and non-word reading data. The authors of PMSP all believe, I think, that there are semantic as well as phonological sources of influence on word reading, so that the system is, to an extent, a kind of dual-route system. This was in fact articulated in the earlier, SM formulation. 
This can lead to apparent dissociations in fMRI and effects of brain damage on reading, but the dissociation is fundamentally one of semantic vs phonological processes rather than lexical vs rule-guided processes. For example the phonological system, while sensitive to regularities, nevertheless captures knowledge of specific high-frequency exceptions. -- Jay McClelland From mozer at cs.colorado.edu Tue Aug 25 12:25:29 1998 From: mozer at cs.colorado.edu (Mike Mozer) Date: Tue, 25 Aug 98 10:25:29 -0600 Subject: What have neural networks achieved? In-Reply-To: Your message of Fri, 14 Aug 98 14:07:20 -0800. Message-ID: <199808251625.KAA17844@neuron.cs.colorado.edu> > b) What are the "big success stories" (i.e., of the kind the general public > could understand) for neural networks contributing to the construction of > "artificial" brains, i.e., successfully fielded applications of NN hardware > and software that have had a major commercial or other impact? I've been involved with a company, Sensory Inc., that produces low-cost neural-net-based ICs for speech recognition. We have sold several million units and have an 85% market share in dedicated speech recognition chips. The chip has been used in several dozen applications, including: toys, electronic learning aids, automobiles, consumer electronics, home appliances, light switches, telephones, and clocks. Due to cost constraints that limit RAM and processor speed, performing recognition with alternative approaches such as HMMs would not be feasible. The company web page sucks, but better ones are in the works with a listing of current products: www.sensoryinc.com. Don't blame me for the nonsensical jargon in the company literature. Mike Mozer From aminai at ececs.uc.edu Tue Aug 25 14:30:26 1998 From: aminai at ececs.uc.edu (Ali Minai) Date: Tue, 25 Aug 1998 14:30:26 -0400 (EDT) Subject: What have neural networks achieved? Message-ID: <199808251830.OAA03867@holmes.ececs.uc.edu> David Redish writes: In addition to the self-localization role, the hippocampus has been shown to replay recently traveled routes during sleep (Skaggs and McNaughton, 1996). However, the mechanisms that have been proposed to accomplish these two functions require incompatible connection matrices. Self-localization requires a symmetric component and route-replay requires an asymmetric component. We showed (Redish and Touretzky, 1998) that with the incorporation of external inputs representing spatial cues during self-localization (obviously necessary for accurate self-localization), self-localization can be accurate even with a weak asymmetric component, and that the weak asymmetric component is sufficient to replay the recently traveled routes (without the external input, which would presumably not be present during sleep). This shows that the two roles hypothesized for hippocampus are not incompatible. To add to David's very interesting comments on how self-localization and replay of learned sequences are not incompatible, I would point out that the hippocampal system has a variety of recurrent connection pathways at various hierarchical levels (e.g., CA3-CA3, dentate-hilus-dentate, entorhinal cortex-dentate-CA3-CA1-entorhinal cortex, etc.), and a variety of time-scales at which processes occur (e.g., a theta cycle, a gamma cycle, etc.) It is quite possible for functions requiring symmetric and asymmetric connectivities to coexist in the hippocampus if they occur in different subsystems and/or at different time-scales. 
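The compatibility argument sketched by Redish and Minai -- a symmetric weight component that holds a stable, cue-anchored activity bump, plus a weak asymmetric component that moves the bump along a stored route once the cue is removed -- can be caricatured in a few lines. The ring network below is a generic continuous-attractor toy; its parameters are arbitrary and untuned, and it is not the Redish and Touretzky model:

import numpy as np

N = 60
ang = 2 * np.pi * np.arange(N) / N

def ring_weights(shift):
    d = np.angle(np.exp(1j * (ang[:, None] - ang[None, :] - shift)))
    return np.exp(-d ** 2 / (2 * 0.3 ** 2))

# symmetric component plus a weak asymmetric (shifted) component
W = ring_weights(0.0) + 0.15 * ring_weights(0.4)

def step(r, cue=None):
    inp = W @ r - 0.5 * r.sum()        # recurrent excitation, global inhibition
    if cue is not None:
        inp += cue
    r = np.maximum(inp, 0)
    return r / (r.sum() + 1e-9)

cue = np.exp(-np.angle(np.exp(1j * (ang - np.pi))) ** 2 / 0.1)   # cue at angle pi
r = np.ones(N) / N
for _ in range(50):                    # with the cue: bump should lock near pi
    r = step(r, cue)
print("bump with cue (cue at 3.14):", round(ang[r.argmax()], 2))
for _ in range(50):                    # cue removed: weak asymmetry drifts the bump
    r = step(r)
print("bump after free drift:     ", round(ang[r.argmax()], 2))

With the external cue present the bump should stay pinned near the cued location despite the asymmetry; with the cue removed, the same weak asymmetry steadily advances the bump, which is the qualitative replay point.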
I often find that apparent conflicts or trade-offs in modeling result from our neglecting hierarchical considerations (in both space and time) and the possibility of multiple modes of operation. There is plenty of evidence for both in the brain, but a lot of neural modeling still focuses on single time-scales, single (or perhaps 2 or 3) modes, and ``compact'' systems. The hippocampal models developed by Redish and Touretzky are especially interesting because they place the hippocampus in a larger, multi-level context with other systems. Others are starting to address issues of temporal hierarchies in models of memory recall, phase precession, etc. Also, on the point of episodic memory vs. cognitive mapping, it is possible to think of frameworks in which the two may appear as aspects (or even parts) of the same, more abstract, functionality. ----------------------------------------------------------------------------- Ali A. Minai Assistant Professor Complex Adaptive Systems Laboratory Department of Electrical & Computer Engineering and Computer Science University of Cincinnati Cincinnati, OH 45221-0030 Phone: (513) 556-4783 Fax: (513) 556-7326 Email: Ali.Minai at uc.edu Internet: http://www.ececs.uc.edu/~aminai/ ----------------------------------------------------------------------------- From jkolen at typhoon.coginst.uwf.edu Tue Aug 25 17:48:00 1998 From: jkolen at typhoon.coginst.uwf.edu (John F. Kolen) Date: Tue, 25 Aug 1998 16:48:00 -0500 Subject: Two Positions Available Message-ID: <9808251648.ZM7453@typhoon.coginst.uwf.edu> The Institute for Human and Machine Cognition (IHMC) at the University of West Florida has immediate openings for a visiting research scientist and a visiting research programmer. The successful visiting research scientist candidate must hold a Ph.D. in computer science (or equivalent qualification) and have a depth of knowledge in neural networks and computational modeling. Current projects include laser marksmen modeling, spectral analysis of inhomogeneous minerals, and image classification. The successful visiting research programmer candidate must hold an M.S. in computer science (or equivalent qualification) and have C++ programming experience. Knowledge of neural networks and computational modeling is expected. Experience with optics, image processing, spectrometry, geology, human performance, or modeling real-world data will be helpful. Both positions are contingent on project funding. Applicants may be asked to obtain a security clearance with the U.S. Department of Defense. Current projects include laser marksmen modeling, spectral analysis of heterogeneous minerals, and classification of geological formations from images. Many IHMC projects involve interdisciplinary team work, and we are looking for persons who enjoy collaboration with others. Currently, interdisciplinary research is underway in the computational and philosophical foundations of AI, computer-mediated communication and collaboration, smart machines in education, knowledge-based systems, multimedia browsers, fuzzy logic, neural networks, software agents, spatial and temporal reasoning, diagnostic systems, cognitive psychology, reasoning under uncertainty and the design of electronic spaces. Salaries will be commensurate with the levels of qualification. The IHMC was founded with legislative support in 1989 as an interdisciplinary research unit. Additionally, IHMC has succeeded in securing substantial extramural support and has established an enviable research and publication record.
Visit the IHMC web page at http://www.coginst.uwf.edu for more information. The University of West Florida is situated in a 1000-acre protected nature preserve bordering the Escambia River, and is approximately 14 miles north of the country's finest white sand beaches. New Orleans is 3 hours away by car. Please send vita or resume to Prof. John F. Kolen Institute For Human and Machine Cognition University of West Florida 11000 University Pkwy. Pensacola, FL 32514 -- John F. Kolen voice: (850)474-3075 Assistant Professor fax: (850)474-3023 Dept. of Computer Science University of West Florida Pensacola, FL 32514 From max at currawong.bhs.mq.edu.au Tue Aug 25 22:54:03 1998 From: max at currawong.bhs.mq.edu.au (Max Coltheart) Date: Wed, 26 Aug 1998 12:54:03 +1000 (EST) Subject: What have neural networks achieved? Message-ID: <199808260254.MAA11068@currawong.bhs.mq.edu.au> A non-text attachment was scrubbed... Name: not available Type: text Size: 1099 bytes Desc: not available Url : https://mailman.srv.cs.cmu.edu/mailman/private/connectionists/attachments/00000000/09d1131e/attachment.ksh From joachim at moon.fit.qut.edu.au Tue Aug 25 23:51:51 1998 From: joachim at moon.fit.qut.edu.au (Joachim Diederich) Date: Wed, 26 Aug 1998 13:51:51 +1000 Subject: Connectionist symbolic processing Message-ID: <199808260351.NAA09637@moon.fit.qut.edu.au.fit.qut.edu.au> As a follow-up to Michael Healy's note on rule-extraction from neural networks, here are two more recent papers on the topic: Maire, F.: A partial order for the M-of-N rule-extraction algorithm. IEEE TRANSACTIONS ON NEURAL NETWORKS, Vol. 8, No .6, pages 1542-1544, November 1997. Tickle, A.B.; Andrews, R.; Golea, M.; Diederich, J.: The truth will come to light: directions and challenges in extracting the knowledge embedded within trained artificial neural networks. IEEE TRANSACTION ON NEURAL NETWORKS. Special Issue on Hybrid Systems. Scheduled for November 1998. We have a limited number of pre-prints available. Joachim Diederich ********************************************************************** Professor Joachim ("Joe") Diederich Director Machine Learning Research Centre (MLRC) Neurocomputing Laboratory / Data Mining Laboratory Queensland University of Technology _--_|\ Box 2434, Brisbane Q 4001 / QUT AUSTRALIA \_.--._/ Phone: +61 7 3864-2143 v Fax: +61 7 3864-1801 E-mail: joachim at fit.qut.edu.au or joachim at icsi.berkeley.edu WEB: http://www.fit.qut.edu.au/~joachim ********************************************************************** From ken at phy.ucsf.EDU Wed Aug 26 01:00:40 1998 From: ken at phy.ucsf.EDU (Ken Miller) Date: Tue, 25 Aug 1998 22:00:40 -0700 (PDT) Subject: function of hippocampus In-Reply-To: <199808251554.JAA15620@grey.colorado.edu> References: <199808251554.JAA15620@grey.colorado.edu> Message-ID: <13795.38520.152296.309141@coltrane.ucsf.edu> With respect to recent postings about models of hippocampus and memory, I'd like to toss in a cautionary note. A recent report (Elisabeth A. Murray and Mortimer Mishkin, "Object Recognition and Location Memory in Monkeys with Excitotoxic Lesions of the Amygdala and Hippocampus", J. Neuroscience, August 15, 1998, 18(16):6568-6582) finds no deficit in tasks involving visual recognition memory or spatial memory with lesions of hippocampus and amygdala. Instead, deficits in both cases are associated with, and only with, lesion of the overlying rhinal cortex. 
They mention in the discussion evidence that "has suggested that the hippocampus may be more important for path integration on the basis of self-motion cues than for location memory, per se" (though Redish's recent posting mentions evidence against this from recent experiments of Alyan and McNaughton; I couldn't find a reference in medline). This is the latest in a series of reports along these lines from the Mishkin lab, who did much of the original lesion work that seemed to implicate hippocampus in memory. I'm not in any way an expert on this literature -- only a very distant observer -- but I worry that, based on lesion studies that also involved lesions of overlying cortex, both the neuroscience and connectionist communities may have jumped to a wrong conclusion that the hippocampus has a special role in episodic and/or spatial memory. I'd be interested to know if there's still good reason to believe in such a role ... Ken Kenneth D. Miller telephone: (415) 476-8217 Dept. of Physiology fax: (415) 476-4929 UCSF internet: ken at phy.ucsf.edu 513 Parnassus www: http://www.keck.ucsf.edu/~ken San Francisco, CA 94143-0444 From jlm at cnbc.cmu.edu Wed Aug 26 01:37:54 1998 From: jlm at cnbc.cmu.edu (Jay McClelland) Date: Wed, 26 Aug 1998 01:37:54 -0400 (EDT) Subject: What have neural networks achieved? Message-ID: <199808260537.BAA07862@CNBC.CMU.EDU> Max Coltheart writes: > Randy O'Reilly said: > > Interestingly, one of the "successes" of neural networks in this case > was their dramatic failure in the form of the catastrophic > interference phenomenon. This failure tells us something about the > limitations of the cortical memory system, and thus, why we might need > a hippocampus. > > Think about the structure of this argument for a moment. It runs thus: > > 1. Neural networks suffer from catastrophic interference. > 2. Therefore the cortical memory system suffers from catastrophic > interference. > 3. That's why we might need a hippocampus. > > Is everyone happy with the idea that (1) implies (2)? Randy may not have provided quite a full enough description of the observations we made in the McClelland, McNaughton and O'Reilly article concerning what we called 'Complementary Learning Systems' in hippocampus and neocortex. The argument is quite a bit richer than Max's comment suggests, but I will endeavor to summarize it (for full justification and demonstration simulations, see the paper). The argument, based on the successes as well as the failures of connectionist models of learning and memory, was this: The discovery of the structure present in large ensembles of events and experiences, such as, e.g., the structure present in the relations between spelling and sound, requires what we called 'interleaved learning' --- learning in which the connection weights are adapted gradually so that the overall structure present in the ensemble can guide the learning process. It also requires the use of a componential coding scheme, which is essential for good generalization (this theme also appears in the Plaut et al paper, mentioned in my previous post in this discussion). We claimed the neocortex was specialized for structure-sensitive learning, and we observed that neural networks that exhibit this form of learning WOULD exhibit catastrophic interference IF forced to learn quickly, either by turning up the learning rate or by massive repetition of a very small and thus necessarily non-representative sample of training cases.
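The contrast McClelland describes between slow, interleaved learning and fast, massed learning of a non-representative sample shows up in even the simplest distributed associator. The sketch below trains a linear network with the delta rule on random patterns; the dimensions and learning rates are arbitrary, and it is only a cartoon of the point, not the simulations reported in the paper:

import numpy as np

rng = np.random.default_rng(0)
A_in, A_out = rng.standard_normal((20, 50)), rng.standard_normal((20, 10))
B_in, B_out = rng.standard_normal((20, 50)), rng.standard_normal((20, 10))

def train(W, X, Y, lr, epochs):
    for _ in range(epochs):
        for x, y in zip(X, Y):
            W += lr * np.outer(x, y - x @ W)       # delta rule
    return W

def err(W, X, Y):
    return np.mean((X @ W - Y) ** 2)

# fast, massed learning of item set B after item set A
W = train(np.zeros((50, 10)), A_in, A_out, lr=0.02, epochs=200)
print("A error after learning A :", round(err(W, A_in, A_out), 3))
W = train(W, B_in, B_out, lr=0.02, epochs=200)
print("A error after massed B   :", round(err(W, A_in, A_out), 3))   # typically much worse

# slow, interleaved learning over the whole ensemble retains both sets
X, Y = np.vstack([A_in, B_in]), np.vstack([A_out, B_out])
W = train(np.zeros((50, 10)), X, Y, lr=0.005, epochs=400)
print("A error, interleaved     :", round(err(W, A_in, A_out), 3))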
Basically, what's simply happening is that the network is learning the non-representative structure present in the sample, at the expense of whatever it might previously have learned. Max and others might be interested to know that cortical memory systems have been shown to suffer from catastrophic-interference-like effects. Massive repetition of a couple of tactile stimuli spanning several fingers can destroy the topographic map in somatosensory cortex (this is research from Merzenich's group). Generally, however, the cortex avoids catastrophic interference by using a relatively small learning rate, so that, in the normal course of events, the weights will reflect a sufficient sample of the environment. To allow rapid learning of the contents of a particular experience, the argument goes, a second learning system, complementary to the first, is needed; such a system has a higher learning rate and recodes inputs using what we call 'sparse, random conjunctive coding' to minimize interference (while simultaneously reducing the adequacy of generalization). These characteristics are just the ones that appear to characterize the hippocampal system: it is the part of the brain known to be crucial for the rapid learning of the contents of specific experiences; it is massively plastic; and neuronal recording studies indicate that it does indeed use sparse, random conjunctive coding. Citations for the relevant articles follow. -- Jay McClelland -------------------------------------- @Article{McClellandMcNaughtonOReilly95, author = "McClelland, J. L. and McNaughton, B. L. and O'Reilly, R. C." , year = "1995" , title = "Why there are complementary learning systems in the hippocampus and neocortex: {Insights} from the successes and failures of connectionist models of learning and memory" , journal= "Psychological Review", volume = "102", pages = "419-457" } @Article{OReillyMcClelland94, author = "O'Reilly, R. C. and McClelland, J. L.", title = "Hippocampal conjunctive encoding, storage, and recall: {Avoiding} a tradeoff", journal = "Hippocampus", year = 1994, volume = 4, pages = "661-682" } @Article{McClellandGoddard96, author = "McClelland, J. L. and Goddard, N. H." , year = "1996" , title = "Considerations arising from a complementary learning systems perspective on hippocampus and neocortex" , journal = "Hippocampus" , volume = "6" , pages = "654--665" } @Article{PlautETAL96, author = "Plaut, D. C. and McClelland, J. L. and Seidenberg, M. S. and Patterson, K. E.", title = "Understanding Normal and Impaired Word Reading: {Computational} Principles in Quasi-Regular Domains", journal = "Psychological Review", volume = "103", pages = "56-115", year = "1996" } Sorry, I'm not sure of the correct citation for the Merzenich finding mentioned. It may be: @Article{WangMerzenichSameshimaJenkins95, author = {Wang, X. and Merzenich, M. M. and Sameshima, K. and Jenkins, W. M.}, title = {Remodelling of hand representation in adult cortex determined by timing of tactile stimulation}, journal = {Nature}, year = {1995}, volume = {378}, pages = {71-75} } From arbib at pollux.usc.edu Wed Aug 26 02:58:00 1998 From: arbib at pollux.usc.edu (Michael A. Arbib) Date: Tue, 25 Aug 1998 22:58:00 -0800 Subject: Neural networks and brain function In-Reply-To: <199808251157.MAA01484@ylem.anat.ucl.ac.uk> Message-ID: Dear Dr. Harris: Thank you very much for your very helpful remarks. I would like to agree and disagree!! You write: >After neural networks, we have a different set of analogies.
>We now make neurological models that ascribe a particular >computational function to a brain structure. For example: > > "The cerebellum performs supervised learning" > "The hippocampus functions as an autoassociative memory" > >By talking about a computational function, rather than a >type of task that a brain structure is needed for, a >lot of apparent conflict can suddenly be resolved. > >In the example of the cerebellum, the evidence that the cerebellum >is involved in motor control and classical conditioning, and >even higher cognitive functions does not seem so contradictory. >It is very plausible that a supervised learning network >would be useful for all of these functions -- see for example >the work of Kawato and Thompson. I agree - and in fact noted that the integration of work on the role of cerebellum in motor control and classical conditioning was a specific target of work at USC. You might reply: "Why is it a target? The problem is solved! The cerebellum does supervised learning! What more needs to be said?" And this is where I disagree - the observation that we might use supervised learning rather than Hebbian or reinforcement learning is only part of the issue under investigation. There are 2 complementary concerns: a) Many many functions can exploit supervised learning. Thus to show that A and B both use supervised learning in no way guarantees that the cerebellum carries out both of them - though I very much accept your point that the observation that they exploit the same learning mechanism seems a very useful step in that direction. b) Again, supervised learning can be realized in simple networks. We still have many questions to answer (we do have partial answers) as to why the cerebellar cortex has the structure it has, what might be the specific advantages of the actual mix of LTD and "re-potentiation" at the parallel-fiber-Purkinje cell synapse, and what is the relation between cerebellar cortex, cerebellar nucleus and inferior olive in a "generic" microcomplex. Even when we have answered that, we still have to ask whether - for posited function A or B - there is a set of cerebellar microcomplexes appropriately wired up to other brain regions to realize the supervised adaptation of that function. >In the example of the hippocampus, work by Michael Recce and >myself has shown how an autoassociative memory can play a role >in both episodic memory and spatial function, in particular >giving an animal localisation ability by performing pattern >completion on partial egocentric maps. I very much look forward to reading your paper! Nonetheless, the above observations also apply to autoassociative memory - this is not limited to HC. Conversely, recent work of ours suggests that, to fully understand its role in navigation, we must embed HC in a larger system including parietal cortex and other regions. It seems unlikely that the same pattern of embedding will account for episodic memory. So ... thank you for a stimulating general perspective. I look forward to other messages on brain modeling - both those adding to the stock of general principles, and those showing how the particularities of a system (LTP/LTD, neural morphology, embedding within larger networks of networks) account for its diverse functions. ********************************* Michael A. 
Arbib USC Brain Project University of Southern California Los Angeles, CA 90089-2520, USA arbib at pollux.usc.edu (213) 740-9220; Fax: 213-740-5687 http://www-hbp.usc.edu/HBP/ From mherrma at gwdg.de Wed Aug 26 05:37:14 1998 From: mherrma at gwdg.de (Michael Herrmann) Date: Wed, 26 Aug 1998 11:37:14 +0200 Subject: Staff Scientist Position Message-ID: <35E3D74A.41C6@gwdg.de> Staff Scientist Position in Theoretical Neuroscience Max-Planck-Institut fuer Stroemungsforschung Nonlinear Dynamics Group Goettingen, Germany The institute invites applications for a staff scientist position in its theory group. Candidates are expected to have excellent academic qualifications and a proven record of research in one or more of the following fields: computational neuroscience, theoretical brain research, systems neurobiology, adaptive behavior -- in addition to a good background in theoretical physics. Research at the institute focuses on mesoscopic systems in physics and biology and on computational neuroscience. The successful candidate is expected to carry out independent research in the field of her/his specialization and to collaborate with post-docs and graduate students. The group is closely affiliated with the University of Goettingen, where a "Habilitation" (secondary doctoral degree) may be pursued. For a detailed description of the research projects and resources of our group and for information about the city of Goettingen, please visit our WWW homepage at http://www.chaos.gwdg.de. The appointment will be for a fixed term of up to five years starting from September 1998 or later. Salary is according to the German BAT IIa/Ib bracket. The institute encourages applications from all qualified candidates, particularly women and persons with disabilities. Please send your application including CV, publication list (including up to three selected reprints), and a statement of your scientific interests and research plans as soon as possible to: Prof. Dr. Theo Geisel MPI fuer Stroemungsforschung Bunsenstrasse 10 D-37073 Goettingen, Germany From bryan at cog-tech.com Wed Aug 26 09:40:08 1998 From: bryan at cog-tech.com (Bryan B. Thompson) Date: Wed, 26 Aug 1998 09:40:08 -0400 Subject: What have neural networks achieved? In-Reply-To: <199808260254.MAA11068@currawong.bhs.mq.edu.au> (message from Max Coltheart on Wed, 26 Aug 1998 12:54:03 +1000 (EST)) Message-ID: <199808261340.JAA20766@cti2.cog-tech.com> Max, Think about the structure of this argument for a moment. It runs thus: 1. Neural networks suffer from catastrophic interference. 2. Therefore the cortical memory system suffers from catastrophic interference. 3. That's why we might need a hippocampus. Is everyone happy with the idea that (1) implies (2)? Max max at currawong.bhs.mq.edu.au I am not happy with the conclusion (1), above. Catastrophic interference is a function of the global quality of the weights involved in the network. More local networks are, of necessity, less prone to such interference as less overlapping subsets of the weights are used to map the transformation from input to output space. Modifying some weights may have *no* effect on some other predictions. In the extreme case of table lookups, it is clear that catastrophic interference completely disappears (along with most options for generalization, etc.:) In many ways, it seems that this statement is true for supervised learning networks in which weights are more global than not.
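Thompson's locality point can be checked directly with the same kind of toy associator used earlier in this thread: items coded by non-overlapping, one-hot inputs (the table-lookup extreme) do not interfere under sequential training, while dense distributed codes over the same weights do. All numbers below are arbitrary; this is a sketch only:

import numpy as np

rng = np.random.default_rng(1)

def forgetting(X_A, X_B):
    # learn item set A, then item set B, with the same delta rule;
    # return the error on A afterwards
    Y_A, Y_B = rng.standard_normal((10, 5)), rng.standard_normal((10, 5))
    W = np.zeros((X_A.shape[1], 5))
    for X, Y in [(X_A, Y_A), (X_B, Y_B)]:
        for _ in range(100):
            for x, y in zip(X, Y):
                W += 0.05 * np.outer(x, y - x @ W)
    return np.mean((X_A @ W - Y_A) ** 2)

dense_A, dense_B = rng.standard_normal((10, 20)), rng.standard_normal((10, 20))
onehot = np.eye(20)
print("dense codes  :", round(forgetting(dense_A, dense_B), 4))          # interference
print("one-hot codes:", round(forgetting(onehot[:10], onehot[10:]), 4))  # ~0

In the one-hot case each item's update touches a disjoint row of the weight matrix, so learning B literally cannot move the weights that encode A.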
Other, more practical counter examples would include (differentiable) CMACs and radial basis function networks. A related property is the degree to which a learned structure ossifies in a network, such that the network is, more or less, unable to respond to a changing environment. This is related to being stuck in local minima, but the solution may even have been optimal for the initial environmental conditions. Networks, or systems of networks, which permit multiple explanatory theories to be explored at once are less susceptible to both of these pitfalls (catastrophic interference and ossification, or loss of plasticity). The support of "multiple explanatory theories" opens the door to area which does not appear to receive enough attention: neural architectures which perform many-to-many mappings vs learn to estimate a weighted average of their target observations. For example, you are drawing samples from a jar of colored marbles, or prediction of the part-of- speech to follow in a sentence. Making the wrong prediction is not an error, it should just lead to updating the probability distribution over the possible outcomes. Averaging the representations and predicting, e.g., green(.8) + red(.2) => greed(1.0), is the error. So, are there good reasons for believing that "cortical memory system"s (a) exhibit these pitfalls (catastrophic interference, ossification or loss of plasticity, and averaging target observations). or (b) utilize architectures which minimize such effects. Clearly, the answer will be a mixture, but I believe that these effects are all minimized in our adaptive behaviors. -- bryan thompson bryan at cog-tech.com From cjcb at molson.ho.lucent.com Wed Aug 26 09:41:05 1998 From: cjcb at molson.ho.lucent.com (Chris Burges) Date: Wed, 26 Aug 1998 09:41:05 -0400 Subject: neural network success story Message-ID: <199808261341.JAA28643@cottontail.lucent.com> > b) What are the "big success stories" (i.e., of the kind the general public > could understand) for neural networks contributing to the construction of > "artificial" brains, i.e., successfully fielded applications of NN hardware > and software that have had a major commercial or other impact? Lucent Technologies sells the Lucent Courtesy Amount Reader (LCAR) to read financial amounts on US checks. This software is currently installed in a number of banks and is processing several million checks per day. LCAR reads machine print and handwritten amounts on both personal and business checks. The amount recognition algorithms are based on feed-forward convolutional neural networks. The basic ideas underlying the graph-based approach to both segmentation and neural network training can be found in: C.J.C. Burges, O. Matan, Y. Le Cun, J.S. Denker, L.D. Jackel, C.E. Stenard, C.R. Nohl, J.I. Ben, "Shortest Path Segmentation: A Method For Training a Neural Network to Recognize Character Strings", IJCNN Conference Proceedings Vol 3, pp. 165-172, 1992 J. Denker, C.J.C. Burges, "Image Segmentation and Recognition", in The Mathematics of Generalization: Proceedings of the SFI/CNLS Workshop on Formal Approaches to Supervised Learning, Addison Wesley, ISBN 0-201-40985-2, 1994 C.J.C. Burges, J.I. Ben, J.S. Denker, Y. LeCun and C.R. Nohl, "Off Line Recognition of Handwritten Postal Words Using Neural Networks", International Journal of Pattern Recognition and Artificial Intelligence, Vol. 7, Number 4, p. 
689, 1993; also in Advances in Pattern Recognition Systems Using Neural Network Technologies, Series in Machine Perception and Artificial Intelligence, Volume 7, Edited by I. Guyon and P.S.P Wang, World Scientific, 1993. More recently, the graph-based approach has been significantly extended to allow end-to-end training of large, complex systems. For this see: Leon Bottou, Yoshua Bengio, Yann Le Cun, "Global Training of Document Processing Systems using Graph Transformer Networks", In Proceedings of Computer Vision and Pattern Recognition, Puerto Rico, IEEE, 1997. An extended paper discussing this will appear soon in Transactions of IEEE: Yann Le Cun, Leon Bottou, Yoshua Bengio, and Patrick Haffner, "Gradient Based Learning Applied to Document Recognition", to appear in Proceedings of IEEE. The underlying neural networks used by the system are the convolutional feed forward "LeNet" series. These are pretty well known by now. One place to go for a description, and a comparison with other algorithms, is: Yann Le Cun, Lawrence D. Jackel, Leon Bottou, Corinna Cortes, John S. Denker, Harris Drucker, Isabelle Guyon, Urs A. Muller, Eduard Sackinger, Patrice Simard, and Vladimir N. Vapnik. Learning algorithms for classification: A comparison on handwritten digit recognition. In J. H. Oh, C. Kwon, and S. Cho, editors, Neural Networks: The Statistical Mechanics Perspective, pages 261-276. World Scientific, 1995. There is quite a bit more to the LCAR system than is represented by these refs. (e.g. how to read handwritten fractional amounts), but those methods are not yet written up anywhere. However you can find a little more information on the LCAR system itself at http://www.lucent.dk/ssg/html/lcar.html. - Chris Burges burges at lucent.com From oreilly at grey.colorado.edu Wed Aug 26 13:08:40 1998 From: oreilly at grey.colorado.edu (Randall C. O'Reilly) Date: Wed, 26 Aug 1998 11:08:40 -0600 Subject: function of hippocampus In-Reply-To: <13795.38520.152296.309141@coltrane.ucsf.edu> (message from Ken Miller on Tue, 25 Aug 1998 22:00:40 -0700 (PDT)) References: <199808251554.JAA15620@grey.colorado.edu> <13795.38520.152296.309141@coltrane.ucsf.edu> Message-ID: <199808261708.LAA16289@grey.colorado.edu> Ken Miller writes: > With respect to recent postings about models of hippocampus and > memory, I'd like to toss in a cautionary note. A recent report > (Elisabeth A. Murray and Mortimer Mishkin, "Object Recognition and > Location Memory in Monkeys with Excitotoxic Lesions of the Amygdala > and Hippocampus", J. Neuroscience, August 15, 1998, 18(16):6568-6582) > finds no deficit in tasks involving visual recognition memory or > spatial memory with lesions of hippocampus and amygdala. Instead, > deficits in both cases are associated with, and only with, lesion of > the overlying rhinal cortex. One general take on the division of labor between the rhinal cortex and the hippocampus proper is that the rhinal cortex can subserve "familiarity" based tasks (e.g., recognition), and the hippocampus is only necessary for actually recalling information. Familiarity might be subserved by priming-like, small weight changes that shift the relative balance of recently-activated representations. In contrast, the hippocampus proper seems particularly well suited for doing pattern completion, where a cue triggers the recall (completion) of a previously stored pattern. This requires storing a conjunctive representation that binds together all the elements of an event (so as to be recalled with a partial cue). 
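The pattern-completion operation invoked here -- and in Harris's earlier message about the hippocampus as an autoassociative memory -- is the standard autoassociator story: store patterns by Hebbian outer products, then let the recurrent dynamics complete a partial cue. The sketch below is generic Hopfield-style textbook code, not any of the hippocampal models cited:

import numpy as np

rng = np.random.default_rng(2)
N, n_patterns = 100, 5
patterns = rng.choice([-1, 1], size=(n_patterns, N))

W = (patterns.T @ patterns) / N        # Hebbian outer-product storage
np.fill_diagonal(W, 0)

def complete(cue, steps=20):
    s = cue.copy()
    for _ in range(steps):             # synchronous updates, for brevity
        s = np.sign(W @ s)
        s[s == 0] = 1
    return s

cue = patterns[0].copy()
cue[40:] = 0                           # partial cue: only 40 of 100 units given
recalled = complete(cue)
print("fraction of units recalled correctly:",
      (recalled == patterns[0]).mean())   # expected to be ~1.0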
Both familiarity and recollection can contribute to recognition memory, but having just familiarity can presumably get you pretty far. There is a growing literature that generally supports this distinction, some refs included below. I can't comment as to how this relates to spatial navigation. - Randy @article{Yonelinas97, author = {Yonelinas, A. P.}, title = {Recognition memory {ROCs} for item and associative information: The contribution of recollection and familiarity.}, journal = {Memory and Cognition}, pages = {747-763}, year = {1997}, volume = {25} } Aggleton & Brown, in press. Episodic Memory, Amnesia, and the Hippocampal-Anterior Thalamic Axis. Behavioral Brain Sciences, (penultimate draft available from BBS ftp site). @incollection{OReillyNormanMcClelland98, author = {O'Reilly, R. C. and Norman, K. A. and McClelland, J. L.}, editor = {Jordan, M. I. and Kearns, M. J. and Solla, S. A.}, title = {A Hippocampal Model of Recognition Memory}, booktitle = {Advances in Neural Information Processing Systems 10}, year = {1998}, publisher = {MIT Press}, address = {Cambridge, MA} } this is available as: ftp://grey.colorado.edu/pub/oreilly/papers/hip_rm_nips.ps +-----------------------------------------------------------------------------+ | Dr. Randall C. O'Reilly | | | Assistant Professor | | | Department of Psychology | Phone: (303) 492-0054 | | University of Colorado Boulder | Fax: (303) 492-2967 | | Muenzinger D251C | Home: (303) 448-1810 | | Campus Box 345 | email: oreilly at psych.colorado.edu | | Boulder, CO 80309-0345 | www: http://psych.colorado.edu/~oreilly | +-----------------------------------------------------------------------------+ From max at currawong.bhs.mq.edu.au Wed Aug 26 19:07:42 1998 From: max at currawong.bhs.mq.edu.au (Max Coltheart) Date: Thu, 27 Aug 1998 09:07:42 +1000 (EST) Subject: What have neural networks achieved? Message-ID: <199808262307.JAA13303@currawong.bhs.mq.edu.au> A non-text attachment was scrubbed... Name: not available Type: text Size: 5004 bytes Desc: not available Url : https://mailman.srv.cs.cmu.edu/mailman/private/connectionists/attachments/00000000/782f2d7c/attachment.ksh From horn at neuron.tau.ac.il Wed Aug 26 19:55:54 1998 From: horn at neuron.tau.ac.il (David Horn) Date: Thu, 27 Aug 1998 02:55:54 +0300 (IDT) Subject: What have neural networks achieved? Message-ID: Neuronal Regulation: A Complementary Mechanism to Hebbian Learning ------------------------------------------------------------------ We would like to point out the use of neural networks not for modelling a particular nucleus or cortical area, but for introducing and testing a general principle of information processing in the brain. Hebb can be viewed as the founder of this approach, suggesting a principle of how memories can be encoded in neuronal circuits. His ideas are still being tested today, and with the advent of knowledge regarding short term and long term synaptic plasticity, an understanding of learning and memory seems to be imminent. Yet there are still many open questions. One of the most interesting ones is the maintenance of memories over long times in the face of continuous synaptic metabolic turnover. We have recently studied this question theoretically (1) and concluded that to achieve long-term memory there has to exist a Neuronal Regulation mechanism with the following properties: 1. Multiplicative modification of all excitatory synapses projecting on a pyramidal neuron by a common, joint factor. 2. 
The magnitude of this regulatory neuronal factor changes inversely with respect to the neuron's post-synaptic potential, or the neuron's firing activity. In contrast to Hebbian changes, the synaptic modifications do not occur on the individual synaptic level as a function of the correlation between the firing of its pre and post-synaptic neurons, but take place in unison over all the synapses projecting on a neuron, as function of its membrane potential. In a series of very elegant slice experiments in rat, Turrigiano et al (2) have recently observed such phenomena. They find activity dependent changes in AMPA mediated mini EPSCs of pyramidal neurons. The regulatory process that they have observed has the features listed above. We believe that this newly observed mechanism serves as a complement to Hebbian synaptic learning. In our studies we found that it regulates basins of attraction of memories, thus preventing formation of pathologic attractors. Neuronal regulation may hence play a uniquely important role in preventing clinical and cognitive abnormalities like schizophrenic positive symptoms, that may result from the formation of such pathologic attractors (3,4). Activity-dependent neural regulatory processes have been previously observed experimentally (5) and studied theoretically (6,7). We were led to the problem of memory maintenance after first studying a neural model of Alzheimer's disease, where the late stages of regulatory processes, that are hypothesized to maintain cognitive function during normal aging, seem to fail (8). References: 1. D. Horn, N. Levy and E. Ruppin: Memory maintenance via neuronal regulation. Neural Computation, 10, 1-18 (1998). 2. G.G. Turrigiano, K. R. Leslie, N. S. Desai, L. C. Rutherford and S. B. Nelson: Activity-dependent scaling of quantal amplitude in neocortical neurons. Nature, 391, 892-895 (1998). 3. D. Horn and E. Ruppin: Compensatory mechanisms in an attractor neural network model of Schizophrenia. Neural Computation 7, 1494-1517 (1994). 4. E. Ruppin, J. Reggia and D. Horn: A neural model of positive schizophrenic symptoms. Schizophrenia Bulletin 22, 105-123 (1996). 5. G. LeMasson, E. Marder and L. F. Abbott: Activity-dependent regulation of conductances in model neurons. Science, 259, 1915-1917 (1993). 6. L. F. Abbott and G. LeMasson: Analysis of neuron models with dynamically regulated conductances. Neural Computation, 5, 823-842 (1993). 7. A. van Ooyen: Activity-dependent neural network development. Network, 5, 401-423 (1994). 8. D. Horn, N. Levy and E. Ruppin: Neuronal-based synaptic compensation: A computational study in Alzheimer's disease. Neural Computation 8, 1227-1243 (1996). David Horn Eytan Ruppin horn at neuron.tau.ac.il ruppin at math.tau.ac.il ---------------------------------------------------------------------------- Prof. David Horn horn at neuron.tau.ac.il School of Physics and Astronomy http://neuron.tau.ac.il/~horn Tel Aviv University Tel: ++972-3-642-9305, 640-7377 Tel Aviv 69978, Israel. Fax: ++972-3-640-7932 From terry at salk.edu Wed Aug 26 21:56:08 1998 From: terry at salk.edu (Terry Sejnowski) Date: Wed, 26 Aug 1998 18:56:08 -0700 (PDT) Subject: What have neural networks achieved? 
Message-ID: <199808270156.SAA06357@helmholtz.salk.edu> A footnote to Jay's last post on interference: One of the best established facts about memory is the spacing effect -- long term retention is much better for a wide variety of tasks and materials if the training is spaced in time rather than massed (cramming may help for the test tomorrow, but you won't remember it next year). Charlie Rosenberg showed that NETtalk exhibits a robust spacing effect when learning a new set of words. The explanation is similar to the one Jay has provided: You don't want to find the nearest place in weight space that codes the new words, but the nearest location in weight space that codes the old words and the new ones. Rosenberg, C. R. and Sejnowski, T. J., The effects of distributed vs massed practice on NETtalk, a massively-parallel network that learns to read aloud, Proceedings 8th Annual Conference of the Cognitive Science Society, Amherst, MA (August 1986). Whether the hippocampus is "replaying" recent experiences to the neocortex during sleep is an open question, though there is some evidence for this in rats. For further discussion see: Sejnowski, T. J., Sleep and memory, Current Biology 5, 832-834 (1995). There are high-amplitude thalamocortical rhythms that occur during sleep whose function is unknown. The mechanisms underlying these slow rhythms have been studied at the biophysical level, and incorporated into network models, which would qualify for a "success" story for understanding large-scale dynamical properties of brain systems: Steriade, M., McCormick, D. A., Sejnowski, T. J., Thalamocortical oscillations in the sleeping and aroused brain, Science 262, 679-685 (1993). Destexhe, A. and Sejnowski, T. J., Synchronized oscillations in thalamic networks: Insights from modeling studies, In: M. Steriade, E. G. Jones and D. A. McCormick (Eds.) Thalamus, Elsevier, pp 331-371 (1997). These models are a first step toward understanding the function of these oscillations, and perhaps someday the function of sleep, which remains a deep mystery. Regarding the comment by Ken Miller, the regions of the cortex that surround the hippocampus, including the entorhinal cortex, the perirhinal cortex and the parahippocampal cortex are staging areas for converging inputs to the hippocampus. Stuart Zola has shown that the severity of amnesia following lesions of these areas in monkeys is greater as more surrounding cortical areas are included in the lesion. The famous case of HM had surgical removal of the temporal lobe which included the areas surrounding the hippocampus. The view in the field is no longer to think of the hippocampus as the primary site but as part of a memory system in reciprocal interaction with these cortical areas. Other brain areas including frontal cortex and the cerebellum also are involved: Tulving E; Markowitsch HJ. Memory beyond the hippocampus. Current Opinion in Neurobiology, 1997 Apr, 7(2):209-16. Functional magnetic resonance imaging is a powerful new tool for measuring activity and has been applied to memory systems -- see the latest results in the 21 August issue of Science. Terry ----- From aminai at ececs.uc.edu Thu Aug 27 01:18:27 1998 From: aminai at ececs.uc.edu (Ali Minai) Date: Thu, 27 Aug 1998 01:18:27 -0400 (EDT) Subject: function of hippocampus Message-ID: <199808270518.BAA06571@holmes.ececs.uc.edu> Ken Miller writes: With respect to recent postings about models of hippocampus and memory, I'd like to toss in a cautionary note. A recent report (Elisabeth A.
Murray and Mortimer Mishkin, "Object Recognition and Location Memory in Monkeys with Excitotoxic Lesions of the Amygdala and Hippocampus", J. Neuroscience, August 15, 1998, 18(16):6568-6582) finds no deficit in tasks involving visual recognition memory or spatial memory with lesions of hippocampus and amygdala.... I'm not in any way an expert on this literature -- only a very distant observer -- but I worry that, based on lesion studies that also involved lesions of overlying cortex, both the neuroscience and connectionists communities may have jumped to a wrong conclusion that the hippocampus has a special role in episodic and/or spatial memory. I'd be interested to know if there's still good reason to believe in such a role ... The concern expressed here is certainly warranted --- and not just on theories of hippocampal function. I think a lot of us are increasingly skeptical about theories of a simple, unitary function for the hippocampus. Indeed, different people often mean different things when they use the term ``hippocampus''. That having been said, I do think (and others can marshall the evidence better than I can) that a preponderance of evidence favors a hippocampal involvement in episodic memory and, at least in rodents, spatial cognition. The data from lobotomy patients such as H.M. and the extensive series of results from Squire's group provide convincing evidence that the hippocampus and its surrounding regions are involved in certain types of memory. Whether this role is central or peripheral (but important) is not clear, and I agree that most theories about the CA3 or the hippocampus as the site of associative storage --- temporary or otherwise --- are driven primarily by the intriguing structural analogies with recurrent neural networks. However, that is not necessarily a bad way to proceed. A major problem with experimental neuroscience is its tendency to produce oceans of data based on very narrowly focused experiments. Addressing this with formal large-scale theories provides a valuable --- if imperfect --- means of thinking about the big picture, and we need more of such theorizing. The issue of hippocampal involvement in spatial cognition in rodents is based on a very large body of lesion studies, but is given overwhelming credibility, in my opinion, by the undeniable existence of place cells and head-direction cells. The systematic study of sensory, behavioral, mnemonic, and other correlates of this organized cell activity provides convincing evidence that the hippocampus ``knows'' a great deal about the animal's spatial environment, is very sensitive to it, and responds robustly to disruptions of landmarks, etc. Recent reports on reconstructing physical location from place cell activity (Wilson and McNaughton, Science, 1993; Zhang et al., J. Neurophysiol., 1998; Brown et al., J. Neurosci, in press) clearly show that very accurate information about an animal's spatial position is available in the hippocampus. It must be used for something. Similar results are available about head direction cells. I do not think we really understand what role the rodent hippocampus plays in spatial cognition, but it is hard to dispute that it plays some --- possibly many --- important roles. I think that, as theories about hippocampal function begin to place the hippocampus in the larger context of other interconnected systems (e.g., in the work of Redish and Touretzky), we will move away from the urge to say, ``Here! 
This is what the hippocampus does'' and towards the recognition that it is probably an important part in a larger system for spatial cognition. Indeed, it is quite possible that, when we do arrive at a satisfactory explanation of hippocampal function, we will have no name for it in our current vocabulary (though I have no doubt that psychologists will invent one:-). Finally, one issue that is particularly relevant to hippocampal theories is the possibility that the categories of memory (e.g., episodic, declarative, etc.) or task (DNMS, spatial memory, working memory, etc.) that we use in our theorizing may not match up with the categories relevant to actual hippocampal functionality. Perhaps we are trying to build a science of chemistry based on air, water, fire, and earth. The good news is that the chemistry experiment was eventually successful and we did find our way to the correct elemental categories. Ali ----------------------------------------------------------------------------- Ali A. Minai Assistant Professor Complex Adaptive Systems Laboratory Department of Electrical & Computer Engineering and Computer Science University of Cincinnati Cincinnati, OH 45221-0030 Phone: (513) 556-4783 Fax: (513) 556-7326 Email: Ali.Minai at uc.edu Internet: http://www.ececs.uc.edu/~aminai/ From dwang at cis.ohio-state.edu Wed Aug 26 13:43:38 1998 From: dwang at cis.ohio-state.edu (DeLiang Wang) Date: Wed, 26 Aug 1998 13:43:38 -0400 Subject: What have neural networks achieved? Message-ID: <35E4494A.5C1@cis.ohio-state.edu> On the neuroscience front, a major success story of neural networks is the temporal (oscillatory) correlation theory, proposed and systematically advocated by Christoph von der Malsburg of USC and Univ. of Bochum. His pioneering theory was first described in 1981 (see below) in perhaps the most quoted technical report in neural networks (for a brief but earlier speculation along this line see Milner, 1974). His theory and prediction led to the two first confirmative reports by Echorn et al. (1988) and Gray et al. (1989). Since then numerous experiments have been conducted that confirm the theory (not without some controversy), including many papers published in Nature and Science (see Phillips and Singer, 1997, for a recent review). From rao at salk.edu Thu Aug 27 03:34:58 1998 From: rao at salk.edu (Rajesh Rao) Date: Thu, 27 Aug 1998 00:34:58 -0700 (PDT) Subject: Neural networks and brain function In-Reply-To: Message-ID: <199808270734.AAA23721@dale.salk.edu> > This failure tells us something about the limitations of the cortical > memory system, and thus, why we might need a hippocampus. Speaking of the cortex, some promising results have been obtained in recent years with regard to explaining cortical receptive field properties and interpreting cortical feedback/lateral connections using statistical principles such as maximum likelihood and Bayesian estimation. This line of research goes back to the early ideas of redundancy reduction and predictive coding advocated by Attneave (1954), MacKay (1956), and Barlow (1961). 
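As a toy illustration of the predictive-coding idea (a linear generative model only; the dimensions, the weight matrix G, and the step size are arbitrary assumptions, not taken from any specific model mentioned in this thread), feedback weights generate a prediction of the input, the feedforward signal carries the residual error, and inference settles by reducing that error:

import numpy as np

rng = np.random.default_rng(1)
n_input, n_causes = 16, 4                      # illustrative sizes only
G = rng.normal(0.0, 1.0 / np.sqrt(n_input), (n_input, n_causes))  # feedback ("generative") weights

true_r = np.array([1.0, 0.0, 0.5, 0.0])
x = G @ true_r + 0.05 * rng.normal(size=n_input)   # a noisy "sensory" input

r = np.zeros(n_causes)                         # estimate of the hidden causes
for _ in range(200):
    prediction = G @ r                         # top-down prediction of the input
    error = x - prediction                     # residual carried by the feedforward pathway
    r += 0.1 * (G.T @ error)                   # settle by reducing the prediction error

print("estimated causes:", np.round(r, 2))
print("remaining error :", float(np.linalg.norm(x - G @ r)))

The hidden causes r play the role of the top-down "synthesis", and x - G r is what the feedforward pathway would still need to explain.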
More recent incarnations of these ideas have been in the form of networks that attempt to learn sparse efficient codes (Olshausen and Field, Lewicki and Olshausen), networks that aim to maximize statistical independence of outputs (approaches of Bell and Sejnowski, van Hateren and Ruderman using ICA - http://www.cnl.salk.edu/~tewon/ica_cnl.html), networks that try to learn translation-invariant codes (Dana Ballard and myself), and networks that exploit biological constraints such as rectification for efficient coding (Lee and Seung, Hinton and Ghahramani, Dana Ballard and myself). Application of these algorithms to natural images produce spatial and spatiotemporal receptive field properties qualitatively similar to those observed in the visual cortex (the important and related line of research on correlation-based models of development by Ken Miller and others has already been mentioned in this thread). In the realm of hierarchical models, the early proposal of MacKay and more recently Mumford, ascribing to feedback connections the role of predicting or anticipating inputs, has been formalized in terms of learning generative models of input signals, the idea being that the feedback pathways might represent a learned statistical model of how the inputs are being generated ("synthesis" as opposed to "analysis" in the feedforward pathways). Examples include the work of Dayan, Hinton, Neal and Zemel (Helmholtz machine), Kawato, Hayakawa and Inui (forward-inverse optics model), Dana Ballard and myself (extended Kalman filter model), and related work by people such as Pece, Softky, Ullman, and others (I apologize if I inadvertently missed someone - please post a reply to add to this list). The work on hierarchical models is also closely related to the algorithms in the previous paragraph in that both rely on the idea of generative models, the differences being in the type of constraints imposed and the definition of statistical efficiency used. Although the results obtained thus far have been encouraging, the precise details regarding the neurobiological implementation of these algorithms in the cortex is far from clear. There is also a need for models that allow efficient learning of non-linear hierarchical generative models while at the same time respecting cortical neuroanatomical constraints. This gives me the excuse to advertise (somewhat shamelessly) a post-NIPS workshop on statistical theories of cortical function: the web page http://www.cnl.salk.edu/~rao/workshop.html contains more details and links to the web pages of some of the people pursuing this line of research. References: @article (Attneave54, author = "F. Attneave" , title = "Some informational aspects of visual perception", journal = "Psychological Review" , volume = "61" , number = "3" , year = "1954" , pages = "183-193" ) @incollection{MacKay56, author = "D. M. MacKay", title = "The epistemological problem for automata", editors = "C. E. Shannon and J. McCarthy", booktitle = "Automata Studies", pages = "235-251", publisher = "Princeton, NJ: Princeton University Press", year = "1956" } @incollection{Barlow61, author = "H. B. Barlow", title = "Possible principles underlying the transformation of sensory messages", editor = "W. A. Rosenblith", booktitle = "Sensory Communication", pages = "217-234", publisher = "Cambridge, MA: MIT Press", year = "1961" } (Other references to work mentioned above can be obtained from the web pages of the researchers - see the workshop page given above for some useful links). --- Rajesh P.N. 
Rao, Ph.D. Internet: rao at salk.edu The Salk Institute, CNL & Sloan Ctr VOX: 619-453-4100 x1215 10010 N. Torrey Pines Road FAX: 619-587-0417 La Jolla, CA 92037 WWW: http://www.cnl.salk.edu/~rao/ From jlm at cnbc.cmu.edu Thu Aug 27 07:57:45 1998 From: jlm at cnbc.cmu.edu (Jay McClelland) Date: Thu, 27 Aug 1998 07:57:45 -0400 (EDT) Subject: What have neural networks achieved? In-Reply-To: <199808262307.JAA13303@currawong.bhs.mq.edu.au> (message from Max Coltheart on Thu, 27 Aug 1998 09:07:42 +1000 (EST)) Message-ID: <199808271157.HAA01080@CNBC.CMU.EDU> Max Coltheart writes: To account for surface dyslexia (reading YACHT as "yatched"), they stopped the training of the network before it had successfully learned low-frequency exception words such as this one, and postulated that in the normal reader such words can only be read aloud with input to phonology from a second system (semantics). Two problems with this: (a) it involves giving up the very thing that Jay says was an achievement: a single mechanism that can read aloud all exception words plus nonwords and (b) it predicts that anyone with severe semantic impairment will also show surface dyslexic reading, which is not the case; several recent papers have documented patients with very poor semantics but very good reading of exception words (e.g. Cipolotti & Warrington, J Int Neuropsych Soc 1995 1 104-110). The problems here are more apparent than real. First regarding (b), because of the fact that the spelling-sound mechanism in our model IS capable of learning both the regular and exception words correctly, our model is able to handle cases in which there is severe semantic impairment and no surface dyslexia (See Plaut 97 citation below). We view the extent of reliance on semantics in reading words aloud as a premorbid individual difference variable. Regarding (a), we do not relax the claim that a single (spelling-sound) mechanism CAN account for reading of both regular and exception items, we only suggest that readers CAN ALSO read via meaning, and this allows the spelling-sound system to be lazy in acquiring the hardest items, namely the low frequency exceptions; the extent of the laziness becomes parameter dependent, and thus a natural place for individual differences to arise within the context of the model. All agree that our models should account for disorders as well as unimpaired performance. Our model does account for one thing that the standard dual route model does not account for, which is the fact that all fluent (see note) surface dyslexia patients show spared reading of high frequency exceptions. According to the dual-route approach, it ought to be possible to eliminate exception word reading entirely, but there are no fluent surface dyslexia patients who exhibit this pattern. -- Jay McClelland and Dave Plaut note: we hope we all agree that non-fluent SD patients are not relevant to this debate... sorry if this begins to get technical! @article ( Plaut97, key = "Plaut" , author = "David C.
Plaut" , year = "1997" , title = "Structure and Function in the Lexical System: {Insights} from Distributed Models of Naming and Lexical Decision" , journal = LCP , volume = 12 , pages = "767-808" , keywords= "semantics, reading" ) From kdh at anatomy.ucl.ac.uk Thu Aug 27 10:24:28 1998 From: kdh at anatomy.ucl.ac.uk (Ken Harris) Date: Thu, 27 Aug 1998 15:24:28 +0100 Subject: Neural networks and brain function Message-ID: <199808271424.PAA08096@ylem.anat.ucl.ac.uk> Michael Arbib writes: > a) Many many functions can exploit supervised learning. Thus to show > that A and B both use supervised learning in no way guarantees that the > cerebellum carries out both of them. Agreed. > b) Again, supervised learning can be realized in simple networks. We > still have many questions to answer (we do have partial answers) as to > why the cerebellar cortex has the structure it has, what might be the > specific advantages of the actual mix of LTD and "re-potentiation" at > the parallel-fiber-Purkinje cell synapse, and what is the relation > between cerebellar cortex, cerebellar nucleus and inferior olive in a > "generic" microcomplex. Even when we have answered that, we still have > to ask whether - for posited function A or B - there is a set of > cerebellar microcomplexes appropriately wired up to other brain regions > to realize the supervised adaptation of that function. Again agreed. Simple connectionist networks can be no more than an analogy for the functioning of the brain. Although sometimes it is possible to model the function of a brain structure with out modelling its circuitry. For example, some of Kawato's simulations model the cerebellum by a backprop net. > Conversely, recent work of ours suggests that, to fully understand its > role in navigation, we must embed HC in a larger system including parietal > cortex and other regions. It seems unlikely that the same pattern of > embedding will account for episodic memory. I absolutely agree with the need to embed the hippocampus in a larger system. But I do think this can be done in a way consistent with a role in episodic memory. Without getting too deep into the details of our model: We propose that the neocortex is responsible for constructing and representing an egocentric map of space, i.e. a firing pattern that codes for the egocentric position of environmental features. The hippocampus is an autoassociative memory that performs pattern completion on egocentric maps, as well as on more general firing patterns. This function may explain the involvement of the hippocampus in certain spatial tasks, as well as in general episodic memory. In the example of the Morris water maze, a rat introduced to the maze constructs a partial map from sensory input, that will contain the positions of observable cues but not the hidden platform. This will trigger the recall of a full map stored in the hippocampus during previous exploration, that also contains a representation of the platform location. After recall, the neocortical firing pattern will contain a representation of the platform location, even though it was not directly observed. Neocortical motor systems then allow the rat to head directly towards the platform. 
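The pattern-completion step in such an account can be sketched in a few lines with a standard Hebbian autoassociator (the number of units and of stored maps below is arbitrary, and nothing in the sketch is specific to our model or to hippocampal circuitry):

import numpy as np

rng = np.random.default_rng(2)
n = 400                                        # units in a stored "map" (illustrative)
maps = rng.choice([-1.0, 1.0], size=(3, n))    # a few previously stored full maps

W = sum(np.outer(m, m) for m in maps) / n      # Hebbian (outer-product) storage
np.fill_diagonal(W, 0.0)

partial = maps[0].copy()
partial[n // 2:] = 0.0                         # cue: half of the map's features unobserved

state = partial
for _ in range(10):                            # settle toward a stored attractor
    state = np.sign(W @ state)
    state[state == 0] = 1.0

print("overlap with the stored full map:", float(state @ maps[0]) / n)

Completion from the partial map recovers the missing half of the stored pattern; in the water maze example, the recovered portion would include the representation of the platform location.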
----------------------------------------------- Ken Harris Department of Anatomy and Developmental Biology University College London http://www.anat.ucl.ac.uk/~kdh From adr at nsma.arizona.edu Thu Aug 27 15:01:03 1998 From: adr at nsma.arizona.edu (David Redish) Date: Thu, 27 Aug 1998 12:01:03 -0700 Subject: function of hippocampus In-Reply-To: Your message of "Tue, 25 Aug 1998 22:00:40 MST." <13795.38520.152296.309141@coltrane.ucsf.edu> Message-ID: <199808271901.MAA23599@cortex.NSMA.Arizona.EDU> Ken Miller wrote: >With respect to recent postings about models of hippocampus and >memory, I'd like to toss in a cautionary note. A recent report >(Elisabeth A. Murray and Mortimer Mishkin, "Object Recognition and >Location Memory in Monkeys with Excitotoxic Lesions of the Amygdala >and Hippocampus", J. Neuroscience, August 15, 1998, 18(16):6568-6582) >finds no deficit in tasks involving visual recognition memory or >spatial memory with lesions of hippocampus and amygdala. Instead, >deficits in both cases are associated with, and only with, lesion of >the overlying rhinal cortex. They mention in the discussion evidence >that "has suggested that the hippocampus may be more important for >path integration on the basis of self-motion cues than for location >memory, per se" (though Redish' recent posting mentions evidence >against this from recent experiments of Alyan and McNaughton; I >couldn't find a reference in medline). This is the latest in a series >of reports along these lines from the Mishkin lab, who did much of the >original lesion work that seemed to implicate hippocampus in memory. > >I'm not in any way an expert on this literature -- only a very distant >observer -- but I worry that, based on lesion studies that also >involved lesions of overlying cortex, both the neuroscience and >connectionists communities may have jumped to a wrong conclusion that >the hippocampus has a special role in episodic and/or spatial memory. >I'd be interested to know if there's still good reason to believe in >such a role ... One should be very careful about taking anything in the primate literature as bearing on spatial navigation. All of the primate hippocampal recordings and all primate hippocampal lesions studies have used primates looking at a constellation of objects or being moved about the room in chairs. Rodent studies have shown that hippocampal lesions affect environmental dependent tasks much more so than object dependent tasks (see, for example, Cassaday and Rawlins, 1997). Also, rats restrained by towels or laying in hammocks and passively moved around the room do not show normal place fields (Foster et al. 1989, Gavrilov et al. 1996). The specific task used by Murray and Mishkin (1998) was to find food from one of two wells placed in front of the animal. This task is not "spatial navigation"; it is spatial reasoning. Major distinctions can be drawn between this task and the kind of hippocampal-dependent tasks used in rodent navigation. (1) The task can be solved by an egocentric spatial reasoning system, while the rodent hippocampus seems to be critically involved in allocentric spatial reasoning. (2) The task is dependent on small objects in front of the animal, while the rodent navigation tasks dependent on the hippocampus require manipulations of environmental context. But there is another very nice result from Murray and Mishkin (1996) that does bear on this issue: Alvarez et al. (1994) tested primates with hippocampal lesions in a delayed-non-match-to-sample task (DNMS). Alvarez et al. 
found that their hippocampally lesioned animals were impaired at long delays (10 minutes and 40 minutes), but not short delays (8 sec, 40 sec, 1 minute). They interpreted this difference as a consequence of the length of the delay. However, they did not use the same experimental paradigm for the short and long delay trials: for the longer trials, they removed the monkey from the apparatus, put it back in its home cage during the delay, and returned it to the apparatus after the delay. If the hippocampus is critical for the reinstantiation of context on returning to an environment, we might expect this removal from the environment to strongly affect the hippocampally lesioned animals (Nadel, 1995, Redish, 1997). Murray and Mishkin (1996) tested exactly this: they used a continuous-non-match-to-sample task (CNMS) which is similar to the DNMS task except that animals are shown a sequence of example objects and then shown novel pairs in reverse order. This means that although there is a delay between the time the animal sees the first object and when it sees the corresponding last pair, the animal never leaves the experimental situation. Murray and Mishkin showed that if the animals do not leave the context, then they can perform the task well even without a hippocampus. This environmental context-change is, I think, a better analogy to the rodent navigation literature. adr PS. The Alyan et al. 1997 reference is to a neuroscience abstract. I don't know the current status of the paper they are writing based on that work. REFERENCES P. Alvarez and L. R. Squire (1994) Memory consolidation and the medial temporal lobe: A simple network model, Proceedings of the National Academy of Sciences, USA, 91:7041-7045. H. J. Cassaday and J. N. P. Rawlins (1997) The hippocampus, objects, and their contexts, Behavioral Neuroscience, 111(6):1228-1244. T. C. Foster, C. A. Castro and B. L. McNaughton (1989) Spatial selectivity of rat hippocampal neurons: Dependence on preparedness for movement, Science, 244:1580-1582. V. V. Gavrilov, S. I. Wiener and A. Berthoz (1996) Discharge correlates of hippocampal neurons in rats passively displaced on a mobile robot, Society for Neuroscience Abstracts, 22:910. E. A. Murray and M. Mishkin (1996) 40-minute visual recognition memory in rhesus monkeys with hippocampal lesions, Society for Neuroscience Abstracts, 22:281. L. Nadel, The role of the hippocampus in declarative memory: A commentary on Zola-Morgan, Squire, and Ramus, 1994, Hippocampus, 5:232-234. A. D. Redish (1997) Beyond the Cognitive Map: Contributions to a Computational Neuroscience Theory of Rodent Navigation. PhD Thesis. Carnegie Mellon University. S. Zola-Morgan and L. R. Squire and S. J. Ramus (1994) Severity of memory impairment in monkeys as a function of locus and extent of damage within the medial temporal lobe memory system, 4:483-495. From dblank at comp.uark.edu Thu Aug 27 14:45:05 1998 From: dblank at comp.uark.edu (Douglas Blank) Date: Thu, 27 Aug 1998 13:45:05 -0500 Subject: Connectionist symbol processing: any progress? References: <35D1E00A.7B1CC6A2@icsi.berkeley.edu> Message-ID: <35E5A931.F6E72BE0@comp.uark.edu> Dave_Touretzky at cs.cmu.edu wrote: > I'd like to start a debate on the current state of connectionist > symbol processing? Is it dead? Or does progress continue? For me, it is dead.  Implementing symbol processing in networks was a good first step in solving many problems that plagued symbolic systems. Tony Plate's HRR as applied to analogy is a great example (Plate, 1993). 
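For readers who have not seen HRRs, the core binding operation is just circular convolution, which takes only a few lines to sketch (the dimensionality and the role/filler names below are made up for illustration, and decoding is approximate, as in Plate's scheme):

import numpy as np

rng = np.random.default_rng(3)
n = 1024                                       # HRR dimensionality (illustrative)

def cconv(a, b):                               # circular convolution: binding
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def ccorr(a, b):                               # circular correlation: approximate unbinding
    return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)))

def item():                                    # random item vector, elements ~ N(0, 1/n)
    return rng.normal(0.0, 1.0 / np.sqrt(n), n)

color_role, red, green = item(), item(), item()
trace = cconv(color_role, red)                 # bind the role "color" to the filler "red"

decoded = ccorr(color_role, trace)             # noisy reconstruction of the filler
for name, filler in [("red", red), ("green", green)]:
    print(name, float(decoded @ filler))       # "red" scores near 1, "green" near 0

Because the bound trace has the same dimensionality as its components, such bindings can be superposed and nested, which is part of what made the representation attractive for analogy.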
Using connectionist representations and methodologies, an expensive symbolic similarity estimation process was eliminated in the analogy-making MAC/FAC system (Gentner and Forbus, 1991). The bad news is that, in my opinion, the entire MAC/FAC model (like many symbolic models) has a fatal flaw and will never lead to an autonomous, flexible, creative, intelligent (analogy-making) machine. Even if Gentner's entire model were implemented completely in a network (or even real neurons), their problem would remain: the overall system organization is still "symbolic". Their method requires that analogies be encoded as symbols and structures, which leaves no room for perception or context effects during the analogy making process (for a detailed description of this problem, see Hofstadter, 1995). I believe that in order to solve the big AI/cognitive problems ahead (like making analogies), we, as modelers, will have to face a radical idea: we will no longer understand how our models solve a problem exactly. I mean that, for many complex problems, systems that solve them won't be able to be broken down into symbols and modules, and, therefore, there may not be a description of the solution more abstract than the actual solution itself. Some researchers have been focusing on solving high-level problems via a purely connectionist framework rather than augmenting a symbolic one. Meeden's planning system comes to mind, as does (warning: self-promotion) my own work in analogy-making (Meeden, 1994; Blank, 1997). Rather than focusing on some assumed-necessary symbolically-based process (say, variable binding) these models look at a bigger goal: modeling a complex behavior. Building and manipulating structured representations or binding variables via networks should not be our goals.* Neither should creating a model such that we can understand its inner workings.** Rather, we should focus on the techniques that allow a system to self-organize such that it can solve The Bigger Problems. I think much of the discussion on "learning to learn" has been related to this issue. > I'd love to hear some good news. For me, "connectionist symbol processing" was a very useful stage I went through as a cognitive scientist. Now I see that networks can do the equivalent of processing symbols, and not have anything to do with symbols. In addition, I learned that I can feel ok about not understanding exactly how they do it. -Doug Blank *Of course, building and manipulating structured representations or binding variable via nets is still useful for some problems, just not all of them. **The DOD is not interested in these types of systems. References Blank, D.S. (1997). "Learning to see analogies: a connectionist exploration." Unpublished PhD Thesis, Indiana University, Bloomington. http://dangermouse.uark.edu/~dblank/thesis.html Gentner, D., and Forbus, K. (1991). MAC/FAC: a model of similarity-based access and mapping. In "Proceedings of the Thirteenth Annual Cognitive Science Conference," 504-9. Hillsdale, NJ: Lawrence Erlbaum. Hofstadter, D., and FARG (1995). "Fluid concepts and creative analogies." Basic Books, new York, NY. Meeden, L. (1994) "Towards planning: incremental investigations into adaptive robot control." Unpublished PhD Thesis, Indiana University, Bloomington. http://www.cs.swarthmore.edu/~meeden/ Plate, T.A. (1991). Holographic reduced representations: convolution algebra for compositional distributed representations. In "Proceedings of the Twelfth International Joint Conference on Artificial Intelligence." 
Myopoulos, J. and Reiter, R. (Eds.), pp. 30-35. Morgan Kaufmann. -- ===================================================================== dblank at comp.uark.edu Douglas Blank, University of Arkansas Assistant Professor Computer Science ==================== http://www.uark.edu/~dblank ==================== From zorzi at univ.trieste.it Thu Aug 27 15:21:05 1998 From: zorzi at univ.trieste.it (Marco Zorzi) Date: Thu, 27 Aug 1998 20:21:05 +0100 Subject: What have neural networks achieved? In-Reply-To: <199808251826.OAA27165@CNBC.CMU.EDU> Message-ID: <3.0.5.32.19980827202105.007c2cd0@uts.univ.trieste.it> Jay McClelland writes: >There has been a great deal of connectionist work on the processing of >regular and exceptional material, initiated by the >Rumelhart-McClelland paper on the past tense. Debate has raged on the >subject of the past tense and work there is ongoing, but I won't claim >a success story there at this time. What I would like to point to >instead is the related topic of single word reading. Sejnowski and >Rosenberg's NETTALK first extended connectionist ideas to this issue, >and Seidenberg and McClelland went on to show that a connectionist >model could account in great detail for the pattern of reaction times >found in around 30 studies concerning the effects of regularity, >frequency, and lexical neighbors on reading words aloud. This was >followed by a resounding critique along the lines of Pinker and >Prince's critique of R&M, coming this time from Derrick Besner (and >colleagues) and Max Coltheart (and colleagues). Both pointed to the >fact that the S&M model didn't do a very good job of reading nonwords, >and both claimed that this reflected an in-principle limitation of a >connectionist, single mechanism account: To do a good job with both, >it was claimed, a dual route system was required. > >The success story is a paper by Plaut, McClelland, Seidenberg, and >Patterson, in which it was shown in fact that a single mechanism, >connectionist model can indeed account for human performance in >reading both words and nonwords. The model replicated all the S&M >findings, and at the same time was able to read non-words as well as >human subjects, showing the same types of neighbor-driven responses that >human readers show (eg MAVE is sometimes read to rhyme with HAVE >instead of SAVE). > >Of course there are still some loose ends but it is no longer possible >to claim that a single-mechanism account cannot capture the basic >pattern of word and non-word reading data. The demonstration that a single mechanism (ie a single, uniform network) can deal with both regular and exception items does not speak to the issue of which system humans are more likely to possess. For example, it is easy to constrain a single backpropagation network to perform both "what" and "where" vision tasks (Rueckl, Cave & Kosslyn, 1989), but the most efficient way to do it is through a modular architecture (Jacobs, Jordan, & Barto, 1991); incidentally, this is also what the brain seems to be doing (in a very broad sense). This is the general (and important) issue of modular decomposition in learning (see Ghahramani & Wolpert, 1997, for recent evidence that the brain uses a modular decomposition strategy to learn a new visuomotor task). With regard to the more specific issue of regular vs. exception and/or words vs.
non-words, a modular connectionist perspective (alternative to the approach of Plaut, McClelland, Seidenberg, & Patterson, 1996) can be found in papers (just appeared) by Zorzi, Houghton, and Butterworth (1998a, 1998b) for reading, and by Houghton and Zorzi (1998) for spelling (refs and abstracts below). The main point here is that the regularities of a "quasi-regular" domain such as reading or spelling are more easily and quickly extracted by a network without hidden units and trained with the simple delta rule; this also provides early and robust generalization to novel forms (eg, non-words). The reading model has been shown to account for a wide range of empirical findings, including experimental, neuropsychological and developmental data. -- Marco Zorzi References: Ghahramani, Z., & Wolpert, D.M. (1997). Modular decomposition in visuomotor learning. Nature, 386, 392-395. Houghton, G., & Zorzi, M. (1998). A model of the sound-spelling mapping in English and its role in word and nonword spelling. In Proceedings of the Twentieth Annual Conference of the Cognitive Science Society (pp. 490-501). Mahwah (NJ): Erlbaum. Jacobs, R.A., Jordan, M.I., & Barto, A.G. (1991). Task decomposition through competition in a modular connectionist architecture: The What and Where vision tasks. Cognitive Science, 15, 219-250. Plaut, D. C., McClelland, J. L., Seidenberg, M. S. & Patterson, K. E. (1996). Understanding normal and impaired word reading: Computational principles in quasi-regular domains. Psychological Review, 103, 56-115. Rueckl, J.G., Cave, K.R., & Kosslyn, S.M. (1989). Why are "What" and "Where" processed by separate cortical visual systems? A computational investigation. Journal of Cognitive Neuroscience, 1, 171-186. Zorzi, M., Houghton, G., & Butterworth, B. (1998a). Two routes or one in reading aloud? A connectionist dual-process model. Journal of Experimental Psychology: Human Perception and Performance, 24, 1131-1161. Zorzi, M., Houghton, G., & Butterworth, B. (1998b). The development of spelling-sound relationships in a model of phonological reading. Language and Cognitive Processes (Special Issue: Language Acquisition and Connectionism), 13, 337-371. Two Routes or One in Reading Aloud? A Connectionist Dual-Process Model Marco Zorzi, George Houghton and Brian Butterworth Journal of Experimental Psychology: Human Perception and Performance, 1998, Vol. 24, No. 4, 1131-1161 A connectionist study of word reading is described that emphasizes the computational demands of the spelling-sound mapping in determining the properties of the reading system. It is shown that the phonological assembly process can be implemented by a two-layer network, which easily extracts the regularities in the spelling-sound mapping for English from training data containing many exception words. It is argued that productive knowledge about spelling-sound relationships is more easily acquired and used if it is separated from case-specific knowledge of the pronunciation of known words. It is then shown how the interaction of assembled and retrieved phonologies can account for the combined effects of frequency and regularity-consistency and for the reading performance of dyslexic patients. It is concluded that the organization of the reading system reflects the demands of the task and that the pronunciations of nonwords and exception words are computed by different processes. The development of spelling-sound relationships in a model of phonological reading.
Marco Zorzi, George Houghton and Brian Butterworth Language and Cognitive Processes (Special Issue: Language Acquisition and Connectionsim), 1998, Vol. 13 (2/3), 337-371. Developmental aspects of the spelling to sound mapping for English monosyllabic words are investigated with a simple 2-layer network model using a simple, general learning rule. The model is trained on both regularly and irregularly spelled words, but extracts the regular spelling to sound relationships which it can apply to new words, and which cause it to regularize irregular words. These relationships are shown to include single letter to phoneme mappings as well as mappings involving larger units such as multi-letter graphemes and onset-rime structures. The development of these mappings as a function of training is analyzed and compared with relevant developmental data. We also show that the 2-layer model can generalize after very little training, in comparison to a 3-layer network. This ability relies on the fact that orthography and phonology can make direct contact with each other, and its importance for self-teaching is emphasized. A model of the sound-spelling mapping in English and its role in word and nonword spelling. George Houghton and Marco Zorzi In: Proceedings of the Twentieth Annual Conference of the Cognitive Science Society (p. 490-501), 1998. A model of the productive sound-spelling mapping in English is described, based on previous work on the analogous problem for reading (Zorzi, Houghton & Butterworth, 1998a, 1998b). It is found that a two-layer network can robustly extract this mapping from a representative corpus of English monosyllabic sound-spelling pairs, but that good performance requires the use of graphemic representations. Performance of the model is discussed for both words and nonwords, direct comparison being made with the spelling of surface dysgraphic MP (Behrmann & Bub, 1992). The model shows appropriate contextual effects on spelling and exactly reproduces many of the subject’s spellings. Effects of sound-spelling consistency are examined, and results arising from the interaction of this system with a lexical spelling system are compared with normal subject data. ---------------------------------------------------------------------- Marco Zorzi email: marco at psychol.ucl.ac.uk http://www.psychol.ucl.ac.uk/marco.zorzi/marco.html Department of Psychology voice: +44 171 5045393 University College London fax : +44 171 4364276 Gower Street London WC1E 6BT (UK) (and) Dipartimento di Psicologia voice: +39 40 6767325 Universita` di Trieste fax : +39 40 312272 via dell'Universita` 7 email: zorzi at univ.trieste.it 34123 Trieste (Italy) ---------------------------------------------------------------------- From jagota at cse.ucsc.edu Thu Aug 27 21:42:49 1998 From: jagota at cse.ucsc.edu (Arun Jagota) Date: Thu, 27 Aug 1998 18:42:49 -0700 Subject: a second, similar proposal Message-ID: <199808280142.SAA09611@arapaho.cse.ucsc.edu> As with my earlier call for archivable contributions on the "connectionist symbol processing" debate (the call generated a sufficient response to proceed to the implementation phase) it seems to me there are valuable postings also in the "big success stories" thread which would be nice to collect together into an article. 
In view of this, I make a similar proposal: If you have a "big success story" and are interested in making a half- to one-page archival contribution towards a "distributed" article that collects "big success stories", e-mail me your story in plain text. Sending me what you already posted on Connectionists (if you did) is fine. A mild preference to place all references in bibtex at the end and use latex commands where appropriate. A "big success story" may be in engineering applications or in understanding the brain. I intend to keep the two separate. Contributions would be rapidly reviewed for minimal content. Soft deadline: 9/7/98. The implementation phase will depend on quality and quantity of contributions. Should this phase be entered, the resulting article(s) will be archived in Neural Computing Surveys, http://www.icsi.berkeley.edu/~jagota/NCS Arun Jagota jagota at cse.ucsc.edu From ken at phy.ucsf.EDU Fri Aug 28 02:02:19 1998 From: ken at phy.ucsf.EDU (Ken Miller) Date: Thu, 27 Aug 1998 23:02:19 -0700 (PDT) Subject: What have neural networks achieved? Message-ID: <13798.18411.850138.465168@coltrane.ucsf.edu> >>>>> "-" == Terry Sejnowski writes: -> Regarding the comment by Ken Miller, the regions of the cortex that -> surround the hippocampus, including the entorhinal cortex, the -> perirhinal cortex and the parahippocampal cortex are staging areas -> for converging inputs to the hippocampus. Stuart Zola has shown -> that the severity of amnesia following lesions of these areas in monkeys is -> greater as more surrounding cortical areas are included in the lesion. -> The famous case of HM had surgical removal of the temporal lobe which -> included the areas surrounding the hippocampus. The view in the field is no -> longer to think of the hippocampus as the primary site but as part -> of a memory system in reciprocal interaction with these cortical -> areas. It's clear from the lesion studies that the surrounding pieces of cortex are involved in memory. The problem is, what's the evidence that hippocampus itself is involved in memory? -- given that when Mishkin lesions hippocampus, there is no deficit in visual recognition or spatial memory. Is the main evidence just that it's in heavy reciprocal interaction with the places that *are* implicated by lesion studies in memory? Randy O'Reilly suggested Mishkin's results could be explained by postulating the overlying cortex can handle 'familiarity', and that is enough for the memory tasks studied. Whether it is enough for those tasks is a separate question, but even if so the logic, if I understand it right, seems a little tortured to me (though not impossible): lesioning overlying cortex affects measurements of memory because it destroys all those inputs to hippocampus; yet lesioning hippocampus itself doesn't affect those same measurements of memory because the overlying cortex can do some weak memory-like things by itself. It seems a pretty convoluted set of reasoning to preserve the idea that hippocampus is critical to memory, compared to the simpler conclusion that the overlying cortex is critical to memory and hippocampus just isn't involved. So again, is there positive evidence for *hippocampal* involvement in memory? There may well be some, but I don't think I've heard it yet ... don't mean to be argumentative, just really wondering ... Ken Kenneth D. Miller telephone: (415) 476-8217 Dept.
of Physiology fax: (415) 476-4929 UCSF internet: ken at phy.ucsf.edu 513 Parnassus www: http://www.keck.ucsf.edu/~ken San Francisco, CA 94143-0444 From jlm at cnbc.cmu.edu Fri Aug 28 08:01:32 1998 From: jlm at cnbc.cmu.edu (Jay McClelland) Date: Fri, 28 Aug 1998 08:01:32 -0400 (EDT) Subject: What have neural networks achieved? In-Reply-To: <13798.18411.850138.465168@coltrane.ucsf.edu> (message from Ken Miller on Thu, 27 Aug 1998 23:02:19 -0700 (PDT)) Message-ID: <199808281201.IAA22968@CNBC.CMU.EDU> I reply to Ken's question: There are huge difficulties associated with the determination of whether or not hippocampal lesions produce deficits in memory, especially when categorical labels are used such as 'recognition memory' or 'spatial memory' or 'episodic memory' or whatever. These difficulties include the fact that there are few lesions that are totally foolproof in terms of either their selectivity or their completeness. The difficulties also include the fact that the particular parameters of tasks often make a tremendous difference and labels such as recognition memory, spatial memory, etc really aren't fully adequate to characterize which tasks do and which tasks do not show deficits. It seems to be pretty well established in the rodent literature, however, that ibotenate lesions of the hippocampus (which leave fibers of passage intact) produce profound deficits in many spatial tasks (e.g., animal must learn to find submerged platform in a tank of milky water using cues around the room, and starting from a location in the tank that varies from trial to trial), although again the effects depend on the details of the experiments (the animal can learn if there's a lightbulb over the platform or if training and testing are always done with the same fixed starting place so a fixed path can be used). Much less clear is whether such lesions also produce deficits in non-spatial tasks. In my view the literature is consistent with the idea that the hippocampus is crucial for *rapid* learning that depends on conjunctions of cues; indeed I am one of many who think that the role of the hippocampus in spatial tasks is secondary to its role in using a form of coding we call sparse random conjunctive coding, but this matter is far from settled. There is an overview of the empirical data through 1994 in the McClelland, McNaughton and O'Reilly paper (Psychological Review, 1995, 102, 419-457). I append a couple of other relevant citations. -- Jay McClelland @Article{Jarrard93, author = "Jarrard, L. E.", title = "On the role of the hippocampus in learning and memory in the rat", journal = "Behavioral and Neural Biology", year = 1993, volume = 60, pages = "9-26" } @Article{RudySutherland9X, author = "Rudy, J. W. and Sutherland, R. W.", title = "Configural Association Theory and the Hippocampal Formation: {An} Appraisal and Reconfiguration", journal = "Hippocampus", year = "199?" %I think it's 95 or 96 -- jlm } From lemm at lorentz.uni-muenster.de Fri Aug 28 08:24:52 1998 From: lemm at lorentz.uni-muenster.de (Joerg_Lemm) Date: Fri, 28 Aug 1998 14:24:52 +0200 (MEST) Subject: Paper: A priori information, statistical mechanics Message-ID: The following technical report is available at: http://pauli.uni-muenster.de/~lemm or directly: http://pauli.uni-muenster.de/~lemm/papers/ann98.ps.gz or at the Los Alamos preprint server as cond-mat/9808039 (gzipped ps-file): http://xxx.lanl.gov/ps/cond-mat/9808039 Joerg C.
Lemm: "How to Implement A Priori Information: A Statistical Mechanics Approach" Generalization abilities of empirical learning systems are essentially based on a priori information. The paper emphasizes the need of empirical measurement of a priori information by a posteriori control. A priori information is treated analogously to an infinite number of training data and expressed explicitly in terms of the function values of interest. This contrasts an implicit implementation of a priori information, e.g., by choosing a restrictive function parameterization. Different possibilities to implement a priori information are presented. Technically, the proposed methods are non--convex (non--Gaussian) extensions of classical quadratic and thus convex regularization approaches (or Gaussian processes, respectively). Specific topics discussed include approximate symmetries, approximate structural assumptions, transfer of knowledge and combination of learning systems. Appendix A compares concepts of statistics and statistical mechanics. Appendix B relates the paper to the framework of Bayesian decision theory. University of Muenster Publication No.: MS-TP1-98-12 Available at: http://pauli.uni-muenster.de/~lemm or directly: http://pauli.uni-muenster.de/~lemm/papers/ann98.ps.gz or at the Los Alamos preprint server as cond-mat/9808039 (gzipped ps-file): http://xxx.lanl.gov/ps/cond-mat/9808039 ======================================================================== Dr. Joerg Lemm Universitaet Muenster Email: lemm at uni-muenster.de Institut fuer Theoretische Physik I Phone: +49(251)83-34922 Wilhelm-Klemm-Str.9 Fax: +49(251)83-36328 D-48149 Muenster, Germany http://pauli.uni-muenster.de/~lemm ======================================================================== From harnad at coglit.soton.ac.uk Fri Aug 28 12:28:47 1998 From: harnad at coglit.soton.ac.uk (Stevan Harnad) Date: Fri, 28 Aug 1998 17:28:47 +0100 (BST) Subject: Barsalou on Perceptual Symbol Systems: BBS Call for Commentary Message-ID: Below is the abstract of a forthcoming BBS target article on: PERCEPTUAL SYMBOL SYSTEMS by Lawrence W. Barsalou This article has been accepted for publication in Behavioral and Brain Sciences (BBS), an international, interdisciplinary journal providing Open Peer Commentary on important and controversial current research in the biobehavioral and cognitive sciences. Commentators must be BBS Associates or nominated by a BBS Associate. To be considered as a commentator for this article, to suggest other appropriate commentators, or for information about how to become a BBS Associate, please send EMAIL to: bbs at cogsci.soton.ac.uk or write to: Behavioral and Brain Sciences Department of Psychology University of Southampton Highfield, Southampton SO17 1BJ UNITED KINGDOM http://www.princeton.edu/~harnad/bbs/ http://www.cogsci.soton.ac.uk/bbs/ ftp://ftp.princeton.edu/pub/harnad/BBS/ ftp://ftp.cogsci.soton.ac.uk/pub/bbs/ gopher://gopher.princeton.edu:70/11/.libraries/.pujournals If you are not a BBS Associate, please send your CV and the name of a BBS Associate (there are currently over 10,000 worldwide) who is familiar with your work. All past BBS authors, referees and commentators are eligible to become BBS Associates. To help us put together a balanced list of commentators, please give some indication of the aspects of the topic on which you would bring your areas of expertise to bear if you were selected as a commentator. 
An electronic draft of the full text is available for inspection with a WWW browser, anonymous ftp or gopher according to the instructions that follow after the abstract. ____________________________________________________________________ PERCEPTUAL SYMBOL SYSTEMS Lawrence W. Barsalou Department of Psychology Emory University Atlanta, GA 30322 http://userwww.service.emory.edu/~barsalou/ barsalou at emory.edu KEYWORDS: analogue processing, categories, concepts, frames, imagery, images, knowledge, perception, representation, sensory-motor representations, simulation, symbol grounding, symbol systems ABSTRACT: Prior to the twentieth century, theories of knowledge were inherently perceptual. Since then, developments in logic, statistics, and programming languages have inspired amodal theories that rest on principles fundamentally different from those underlying perception. In addition, perceptual approaches have become widely viewed as untenable, because they are assumed to implement recording systems, not conceptual systems. A perceptual theory of knowledge is developed here in the contexts of current cognitive science and neuroscience. During perceptual experience, association areas in the brain capture bottom-up patterns of activation in sensory-motor areas. Later, in a top-down manner, association areas partially reactivate sensory-motor areas to implement perceptual symbols. The storage and reactivation of perceptual symbols operates at the level of perceptual components--not at the level of holistic perceptual experiences. Through the use of selective attention, schematic representations of perceptual components are extracted from experience and stored in memory (e.g., individual memories of green, purr, hot). As memories of the same component become organized around a common frame, they implement a simulator that produces limitless simulations of the component (e.g., simulations of purr). Not only do such simulators develop for aspects of sensory experience, they also develop for aspects of proprioception (e.g., lift, run) and for introspection (e.g., compare, memory, happy, hungry). Once established, these simulators implement a basic conceptual system that represents types, supports categorization, and produces categorical inferences. These simulators further support productivity, propositions, and abstract concepts, thereby implementing a fully functional conceptual system. Productivity results from integrating simulators combinatorially and recursively to produce complex simulations. Propositions result from binding simulators to perceived individuals to represent type-token relations. Abstract concepts are grounded in complex simulations of combined physical and introspective events. Thus, a perceptual theory of knowledge can implement a fully functional conceptual system while avoiding what it is becoming increasingly apparent would be problems for amodal symbol systems. Implications for cognition, neuroscience, evolution, development, and artificial intelligence are explored. -------------------------------------------------------------- To help you decide whether you would be an appropriate commentator for this article, an electronic draft is retrievable from the World Wide Web or by anonymous ftp or gopher from the US or UK BBS Archive. Ftp instructions follow below. Please do not prepare a commentary on this draft. Just let us know, after having inspected it, what relevant expertise you feel you would bring to bear on what aspect of the article. 
The URLs you can use to get to the BBS Archive: http://www.princeton.edu/~harnad/bbs/ http://www.cogsci.soton.ac.uk/bbs/Archive/bbs.barsalou.html ftp://ftp.princeton.edu/pub/harnad/BBS/bbs.barsalou ftp://ftp.cogsci.soton.ac.uk/pub/bbs/Archive/bbs.barsalou gopher://gopher.princeton.edu:70/11/.libraries/.pujournals To retrieve a file by ftp from an Internet site, type either: ftp ftp.princeton.edu or ftp 128.112.128.1 When you are asked for your login, type: anonymous Enter password as queried (your password is your actual userid: yourlogin at yourhost.whatever.whatever - be sure to include the "@") cd /pub/harnad/BBS To show the available files, type: ls Next, retrieve the file you want with (for example): get bbs.barsalou When you have the file(s) you want, type: quit From zhuh at santafe.edu Fri Aug 28 17:37:15 1998 From: zhuh at santafe.edu (Huaiyu Zhu) Date: Fri, 28 Aug 1998 15:37:15 -0600 (MDT) Subject: What have neural networks achieved? In-Reply-To: Message-ID: As to the objective of a better understanding of the brain, I'd like to draw your attention to a result about The Computational Origin of Addiction. A learning algorithm derived from purely computational considerations was shown to require a particular mechanism reminiscent of that provided by the neurotransmitter dopamine, including the possibility of addiction. I'd be especially interested in hearing responses from people familiar with neurophysiology and the role of dopamine. It might even be possible to test this theory with current experimental technology. ftp://ftp.santafe.edu/pub/zhuh/link-iconip97.ps A possible link between artificial and biological neural network learning rules Huaiyu Zhu A learning rule for stochastic neural networks is described, which corresponds to biological neural systems in all major aspects. Instead of backpropagating a vector through the synapses, only a few scalars are broadcast across the whole network, corresponding to the role played by the neurotransmitter dopamine. In addition, the annealing process avoids local optima in the learning process and corresponds to the difference in learning between adults and children. Some more detailed predictions are made for future comparison with neurophysiological data. (In Proc. Intl. Conf. Neural Information Processing (ICONIP'97), Vol.1, pp.263-266. Dunedin, New Zealand, 28-30 Nov, 1997) Huaiyu -- Huaiyu Zhu Tel: 1 505 984 8800 ext 305 Santa Fe Institute Fax: 1 505 982 0565 1399 Hyde Park Road mailto:zhuh at santafe.edu Santa Fe, NM 87501 http://www.santafe.edu/~zhuh/ USA ftp://ftp.santafe.edu/pub/zhuh/ From larryy at pobox.com Fri Aug 28 18:41:11 1998 From: larryy at pobox.com (Larry Yaeger) Date: Fri, 28 Aug 1998 17:41:11 -0500 Subject: What have neural networks achieved? In-Reply-To: Message-ID: At 2:07 PM -0800 8/14/98, Michael A. Arbib wrote: >b) What are the "big success stories" (i.e., of the kind the general public >could understand) for neural networks contributing to the construction of >"artificial" brains, i.e., successfully fielded applications of NN hardware >and software that have had a major commercial or other impact? I sent a private comment to Michael Arbib, but since I never announced the availability of the comprehensive technical paper on connectionists, I'll briefly pipe up now. Though I wouldn't call it an "artificial brain", I would call it a successfully fielded application of NN software that had some degree of commercial and technological impact...
The "Print Recognizer" in second and subsequent generation Newton PDAs was neural network based, and was fairly widely regarded as the first successful, truly usable handwriting recognition solution. (It had nothing whatsoever to do with the original handwriting recognition system in first generation Newtons.) When it was introduced, this handwriting recognizer essentially "saved" the Newton, breathing new life into the product and bringing a level of public acceptance of the device's primary input method (even though the product was killed a few years later). A fairly detailed technical paper on the subject is available in: Yaeger, L. S., Webb, B. J., Lyon, R. F., Combining Neural Networks and Context-Driven Search for On-Line, Printed Handwriting Recognition in the Newton, AI Magazine, AAAI, 19:1 (Spring 1998) p73-89. Or in preprint form at: Other information on this system and other, more basic research on evolving neural network architectures in a computational ecology can easily be found through my personal web site (URL in .sig below). - larryy ------------------------- "'I heard the Empire has a tyrannical and repressive government!' 'What form of government is that?' 'A tautology.'" ------------------------- "One of these days... Milkshake!... BANG!!" From dcrespin at euler.ciens.ucv.ve Sat Aug 29 07:25:25 1998 From: dcrespin at euler.ciens.ucv.ve (Daniel Crespin(UCV) Date: Sat, 29 Aug 1998 15:25:25 +0400 Subject: What have neural networks achieved? Message-ID: <199808291125.PAA20828@gauss.ucv.ve> About "What have neural networks achieved?", here is a condensed personal viewpoint, particularly about forward pass perceptron neural networks. As you will see below, I expect this e-mail to motivate not only thoughts but also certain concrete action. In order attain perspective, ask the following similar question: "What have computers achieved?" and compare with answers to the previous question. First came the birth of perceptrons. An elegant model for nervous systems, it caught lots of attention. Just after Hitler, the Holocaust and Hiroshima, the possibility of in-depth understanding of the human brain and behaviour could not pass unnoticed. But a persuasive book *against* perceptrons was written, and for some time they were left outside mainstream science. Then, backpropagation was created. A learning algorithm, a paradigm, a source of projects. The field of neural networks was (re)born. In the last analysis, backpropagation is just a special mixture of the gradient method (GM) and the chain rule, inheriting all the difficulties and shortcomings of the former. The classical picture of GM is: High dimensional landscapes with hills, saddles and valleys, starting at a random point and moving downhill towards a local minimum that one one does not know if it is a global one. Or perhaps wandering away towards infinity. Or unadvertedly jumping over the sought-after well. And then, to apply backpropagation, the network architecture has to be defined in advance, a task for which no efficient method has been available. Hence the random starting weights, and the random topology, or the "educated guess", or just the "guess". This means that lots of gaps are left to be filled, which may be good or bad, depending on projects and levels. Number crunching power has been a popular remedy, but the task is rather complex and results are still not satisfactory. This is, with considerable simplification, a possible sketch of the neuroscape to this date. 
The rather limited (as compared with computers in general) lists of successes previously forwarded as answers to the Subject of this e-mail debate give a rather good picture of what NNs have achieved. Imagine now a new, powerful insight into (forward pass perceptron) neural networks. A whole new way to interpret and look at the popular diagrams of dots, arrows and weights, that gives you a useful and substantial picture of what a neural network is, what it does, and what you can expect from it. As soon as data are gathered, your program creates a network and there you go. No more architecture or weight guessing. No more tedious backpropagation. No more thousands of "presentations". But wait. Why waste your time with hype? Not only the theory, but the software itself is readily available. Go to the following URL: http://euler.ciens.ucv.ve/~dcrespin/Pub or http://150.185.69.150/~dcrespin/Pub Go there and download NEUROGON software. This is the action I expect to motivate. It is free for academic purposes and for any other non-profit use. The available version of NEUROGON can be greatly improved, but even this rather limited version, once it is tested and used by workers in the field, could give rise to a much larger list of success stories on neural networks. Regards Daniel Crespin From istvan at usl.edu Sun Aug 30 14:53:59 1998 From: istvan at usl.edu (Dr. Istvan S. N. Berkeley) Date: Sun, 30 Aug 1998 13:53:59 -0500 Subject: Connectionist symbol processing: any progress? References: <35D1E00A.7B1CC6A2@icsi.berkeley.edu> <35E5A931.F6E72BE0@comp.uark.edu> Message-ID: <35E99FC7.3025@USL.edu> Hi there, I am afraid that I can no longer resist adding my 2 cents worth to this debate. Douglas Blank wrote: > I believe that in order to solve the big AI/cognitive problems ahead > (like making analogies), we, as modelers, will have to face a radical > idea: we will no longer understand how our models solve a problem > exactly. I mean that, for many complex problems, systems that solve them > won't be able to be broken down into symbols and modules, and, > therefore, there may not be a description of the solution more abstract > than the actual solution itself. It seems to me that there is something fundamentally wrong about the proposal here. As McCloskey (1991) has argued, unless we can develop an understanding of how network models (or any kind of model for that matter) go about solving problems, they will not have any useful impact upon cognitive theorizing. Whilst this may not be a problem for those who wish to use networks merely as a technology, it surely must be a concern to those who wish to deploy networks in the furtherance of cognitive science. If we follow the suggestion made above then even successful attempts at modelling will be theoretically sterile, as we will be creating nothing more than 'black boxes'. This much having been said, the problem of interpreting and analysing trained network systems is not a trivial one, especially for large scale models. Although there are a variety of techniques which have been deployed (See Berkeley et al. 1995, Hanson and Burr 1990 and Elman 1990, for examples), none of them are entirely satisfactory, or universally applicable. Indeed, there has been some skepticism about the feasibility of trained network analysis in the literature (See Hecht-Nielsen 1990, Mozer and Smolensky 1989 and Robinson, 1992). Nonetheless, if connectionist networks are to prove useful to cognitive science, continuing efforts to better understand mature networks are going to be crucial.
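A minimal sketch of the kind of post-training analysis cited above, in the spirit of the activation density plots of Berkeley et al. (1995); the tiny network, the stand-in "trained" weights and the data are invented for illustration and are not the network from that work:

# Illustrative only: for each hidden unit, collect its activation over a
# whole pattern set and inspect the distribution; tight, well-separated
# clusters ("bands") of activation values are what one then tries to
# interpret.
import math, random

random.seed(1)
n_in, n_hid, n_pat = 4, 3, 200
# stand-in for the weights of an already-trained network
W = [[random.gauss(0.0, 2.0) for _ in range(n_in)] for _ in range(n_hid)]
patterns = [[random.choice([0.0, 1.0]) for _ in range(n_in)] for _ in range(n_pat)]

def hidden_activations(x):
    return [1.0 / (1.0 + math.exp(-sum(w * xi for w, xi in zip(row, x))))
            for row in W]

acts = [hidden_activations(p) for p in patterns]

# crude text "density plot": 10 activation bins per hidden unit
for h in range(n_hid):
    bins = [0] * 10
    for a in acts:
        bins[min(int(a[h] * 10), 9)] += 1
    print(f"hidden unit {h}: " + " ".join(f"{b:3d}" for b in bins))
# Counts concentrated in a few separated bins suggest the unit has settled
# into a small number of discrete activation bands worth interpreting.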
A further point which needs to be raised (and sorry, this is where the self-advertising begins) is that some efforts at trained network analysis have turned up surprising results, which seem highly germane to the topic of connectionist symbol processing. Some years ago a number of members of The Biological Computation Project and I undertook the analysis of a network trained upon a logic problem originally studied by Bechtel and Abrahamsen (1991). Much to our surprise, our analysis showed that the network had developed stable patterns of hidden unit activation which closely mirrored the standard rules of inference from traditional sentential calculus, such as Modus Ponens. Moreover, we were able to make a number of useful and abstract generalizations about network functioning which were novel and informative. These results directly challenged the conclusions originally drawn about the task by Bechtel and Abrahamsen (1991). This work is described in detail in Berkeley et al. (1995). What this work suggests is that, rather than abandoning attempts at understanding mature networks, a more rational and productive path is to attempt to analyse in detail trained network function and then use the empirical results from such studies to inform judgements about connectionist symbol processing. All the best, Istvan Bibliography Bechtel, W. and Abrahamsen, A. (1991), *Connectionism and the Mind*, Basil Blackwell (Cambridge, MA). Berkeley, I., Dawson, M., Medler, D., Schopflocher, D. and Hornsby, L. (1995) "Density Plots of Hidden Value Unit Activations Reveal Interpretable Bands" in *Connection Science* 7/2, pp. 167-186. Elman, J. (1990), "Finding Structure in Time", in *Cognitive Science* 14, pp. 179-212. Hanson, S. and Burr, D. (1990), "What Connectionist Models Learn: Learning and Representation in Connectionist Networks" in Behavioral and Brain Sciences 13, pp. 471-518. Hecht-Nielsen, R. (1990), *Neurocomputing*, Addison-Wesley Pub. Co. (New York). McCloskey, M. (1991), "Networks and Theories: The Place of Connectionism in Cognitive Science", *Psychological Science* 2/6, pp. 387-395. Mozer, M. and Smolensky, P. (1989), "Using Relevance to Reduce Network Size Automatically", in *Connection Science* 1, pp. 3-16. Robinson, D. (1992) "Implications of Neural Networks for How we Think about Brain Function" in Behavioral and Brain Sciences, 15, pp. 644-655. -- Istvan S. N. Berkeley Ph.D, E-mail: istvan at USL.edu, Philosophy, The University of Southwestern Louisiana, USL P. O. Box 43770, Lafayette, LA 70504-3770, USA. Tel:(318) 482 6807, Fax: (318) 482 6195, http://www.ucs.usl.edu/~isb9112 From adamidis at egnatia.ee.auth.gr Mon Aug 31 05:45:32 1998 From: adamidis at egnatia.ee.auth.gr (Panagiotis Adamidis) Date: Mon, 31 Aug 1998 12:45:32 +0300 Subject: New digital library Message-ID: <199808310945.MAA20151@egnatia.ee.auth.gr> ********Apologies if you receive multiple copies of this email********** Dear colleagues, I'd like to inform you of a new "digital library" available at the following URL: http://www.it.uom.gr/pdp/digital.htm. It contains a lot of resources on the following subjects: Artificial Life, Complex Systems, Evolutionary Computation, Fuzzy Systems, Neural Networks, Parallel and Distributed Processing. Our initial intention was to provide a library (on topics of interest to our lab) that is useful, effective, easy-to-use, up-to-date and attractive to both non-experts and specialists. We hope that the final(?) result is at least useful. We intend to enhance and keep the library up-to-date.
This is very difficult (maybe impossible) without feedback from the users of this library. Your additions/corrections/deletions would be greatly appreciated. The library is maintained at the Parallel & Distributed Processing Lab. of the Department of Applied Informatics of the Univ. of Macedonia, Thessaloniki, Greece. Hope you use it, and send us your feedback. Panagiotis Adamidis Associate researcher, Dept. of Applied Informatics, Univ. of Macedonia Thessaloniki, Greece. Email: adamidis at uom.gr From lorincz at iserv.iki.kfki.hu Mon Aug 31 06:55:44 1998 From: lorincz at iserv.iki.kfki.hu (Andras Lorincz) Date: Mon, 31 Aug 1998 12:55:44 +0200 (MET) Subject: Hippocampus and independent component analysis Message-ID: I would like to announce the availability of a new paper on the functional model of the hippocampus. The model fits smoothly into the overall anatomical structure and its two-phase operational mode. The model is built on two basic postulates: (1) the entorhinal-hippocampal loop serves as a control loop with control errors initiating plastic changes in the hippocampus and (2) the hippocampal output develops independent components in the entorhinal cortex. The paper AUTHOR: Andras Lorincz TITLE: Forming independent components via temporal locking of reconstruction architectures: a functional model of the hippocampus JOURNAL: Biological Cybernetics (in press) may be obtained from: http://iserv.iki.kfki.hu/New/pub.html Abstract: The assumption is made that the formulation of relations as independent components (IC) is a main feature of computations accomplished by the brain. Further, it is assumed that memory traces made of non-orthonormal ICs make use of feedback architectures to form internal representations. Feedback then leads to delays, and delays in cortical processing form an obstacle to this relational processing. The problem of delay compensation is formulated as a speed-field tracking task and is solved by a novel control architecture. It is shown that in addition to delay compensation the control architecture can also shape long term memories to hold independent components if a two-phase operation mode is assumed. Features such as a trisynaptic loop and a recurrent collateral structure at the second stage of that loop emerge in a natural fashion. Based on these properties a functional model of the hippocampal loop is constructed. Andras Lorincz Department of Information Systems Eotvos Lorand University, Budapest, Hungary From rafal at idsia.ch Mon Aug 31 07:51:44 1998 From: rafal at idsia.ch (Rafal Salustowicz) Date: Mon, 31 Aug 1998 13:51:44 +0200 (MET DST) Subject: Hierarchical Probabilistic Incremental Program Evolution Message-ID: H-PIPE: FACILITATING HIERARCHICAL PROGRAM EVOLUTION THROUGH SKIP NODES Rafal Salustowicz Juergen Schmidhuber Technical Report IDSIA-8-98, IDSIA, Switzerland To evolve structured programs we introduce H-PIPE, a hierarchical extension of Probabilistic Incremental Program Evolution (PIPE). Structure is induced by "hierarchical instructions" (HIs) limited to top-level, structuring program parts. "Skip nodes" (SNs) inspired by biology's introns (non-coding segments) allow for switching program parts on and off. In our experiments H-PIPE outperforms PIPE, and SNs facilitate synthesis of certain structured programs but not unstructured ones. We conclude that introns can be particularly useful in the presence of structural bias.
ftp://ftp.idsia.ch/pub/rafal/TR-8-98-H-PIPE.ps.gz http://www.idsia.ch/~rafal/research.html Short version: Evolving Structured Programs with Hierarchical Instructions and Skip Nodes. In J. Shavlik, ed., Machine Learning: Proceedings of the Fifteenth International Conference (ICML'98), pages 488-496, Morgan Kaufmann Publishers, San Francisco, 1998. ftp://ftp.idsia.ch/pub/rafal/ICML98_H-PIPE.ps.gz Rafal & Juergen, IDSIA www.idsia.ch From geoff at giccs.georgetown.edu Mon Aug 31 11:18:11 1998 From: geoff at giccs.georgetown.edu (Geoff Goodhill) Date: Mon, 31 Aug 1998 11:18:11 -0400 Subject: Postdoc position available Message-ID: <199808311518.LAA09521@fathead.giccs.georgetown.edu> POSTDOC IN DEVELOPMENTAL NEUROSCIENCE Georgetown Institute for Cognitive and Computational Sciences Georgetown University Washington DC A postdoctoral position is available immediately in the lab of Dr Geoff Goodhill to develop a novel experimental assay for the quantitative characterization of axon guidance mechanisms. This project is a collaboration with Dr Jeff Urbach (Physics, Georgetown) and Dr Linda Richards (Anatomy and Neurobiology, University of Maryland at Baltimore). An interest in neural development and a strong background in tissue culture techniques is required. Interest in theoretical models is a plus (see TINS, 21, 226-231). More information about the lab can be found at http://www.giccs.georgetown.edu/labs/cns Applicants should send a CV, a letter of interest, and names and addresses (including email) of at least two referees to: Dr Geoffrey J. Goodhill Georgetown Institute for Cognitive and Computational Sciences Georgetown University Medical Center 3970 Reservoir Road Washington DC 20007 Tel: (202) 687 6889 Fax: (202) 687 0617 Email: geoff at giccs.georgetown.edu From mac+ at andrew.cmu.edu Mon Aug 31 14:26:14 1998 From: mac+ at andrew.cmu.edu (Mary Anne Cowden) Date: Mon, 31 Aug 1998 14:26:14 -0400 (EDT) Subject: Carnegie Symposium on Mechanisms of Cognitive Development, Oct 9-11, 1998 Message-ID: =============================================================== CALL FOR PARTICIPATION The 29th Carnegie Symposium on Cognition Mechanisms of Cognitive Development: Behavioral and Neural Perspectives October 9 - 11, 1998 James L. McClelland and Robert S. Siegler, Organizers ---------------------------------------------------------------------------- The 29th Carnegie Symposium on Cognition is sponsored by the Department of Psychology and the Center for the Neural Basis of Cognition. The symposium is supported by the National Science Foundation, the National Institute of Mental Heatlh, and the National Institute of Child Health and Human Development. ---------------------------------------------------------------------------- This post contains the following entries relevant to the symposium: * Overview * Schedule of Events * Attending the Symposium * Travel Fellowships ---------------------------------------------------------------------------- Overview This symposium will consider how children's thinking evolves during development, with a focus on the role of experience in causing change. Speakers will examine the processes by which children learn and those that make children ready and able to learn at particular points in development, using both behavioral and neural approaches. Behavioral approaches will include research on the 'microgenesis' of cognitive change over short time periods (e.g., several hour-long sessions) in specific task situations. 
Research on cognitive change over longer time scales (months and years) will also be presented, as will research that uses computational modeling and dynamical systems approaches to understand learning and development. Neural approaches will include the study of how neuronal activity and connectivity change during acquisition of cognitive skills in children and adults. Other studies will consider the possible emergence of cognitive abilities through the maturation of brain structures and the effects of experience on the organization of functions in the brain. Developmental anomalies such as autism and attention deficit disorder will also be examined, as windows on normal development. Four questions will be examined throughout the symposium: 1) Why do cognitive abilities emerge when they do during development? 2) What are the sources of developmental and individual differences, and of developmental anomalies in learning? 3) What happens in the brain when people learn? 4) How can experiences be ordered and timed so as to optimize learning? The answers to these questions have strong implications for how we educate children and remediate deficits that impede development of thinking abilities. These implications will be explored in discussions among the participants. ---------------------------------------------------------------------------- The 29th Carnegie Symposium on Cognition: Schedule ---------------------------------------------------------------------------- Friday, October 9th: Studies of the Microgenesis of Cognitive Development 8:30 - 9:00 Continental Breakfast 9:00 Welcome BEHAVIORAL APPROACHES 9:20 Susan Goldin-Meadow, University of Chicago Giving the mind a hand: The role of gesture in cognitive change 10:20 Break 10:40 Robert Siegler, Carnegie Mellon University Microgenetic studies of learning in children and in brain-damaged adults 11:40 Lunch NEUROSCIENCE APPROACHES 1:00 Michael Merzenich, University of California, San Francisco Cortical plasticity phenomenology and mechanisms: Implications for neurorehabilitation 2:00 James L. 
McClelland, Carnegie Mellon University/CNBC Revisiting the critical period: Interventions that enhance adaptation to non-native phonological contrasts in Japanese adults 3:00 Break 3:20 Richard Haier, University of California, Irvine PET studies of learning and individual differences 4:20 Discussant: James Stigler, UCLA Saturday, October 10th: Studies of Change Over Long Time Scales 8:30 - 9:00 Continental Breakfast BEHAVIORAL APPROACHES 9:00 Esther Thelen, Indiana University Dynamic mechanisms of change in early perceptual motor development 10:00 Robbie Case, University of Toronto Differentiation and integration as the mechanisms in cognitive and neurological development 11:00 Break 11:20 Deanna Kuhn, Teacher's College, Columbia University Why development does (and doesn't) occur: Evidence from the domain of inductive reasoning 12:20 Lunch NEUROSCIENCE APPROACHES 2:00 Mark Johnson, Birkbeck College/University College London Cortical specialization for cognitive functions 3:00 Helen Neville, University of Oregon Specificity and plasticity in human brain development 4:00 Break 4:20 Discussant: David Klahr, Carnegie Mellon University Sunday, October 11th: Developmental Disorders 8:30 - 9:00 Continental Breakfast DYSLEXIA 9:00 Albert Galaburda, Harvard Medical School Toxicity of neural plasticity as seen through a model of learning disability AUTISM 10:00 Patricia Carpenter, Marcel Just, Carnegie Mellon University Cognitive load distribution in normal and autistic individuals 11:00 Break ATTENTION DEFICIT DISORDER 11:20 B. J. Casey, University of Pittsburgh Medical Center Disruption and inhibitory control in developmental disorders: A mechanistic model of implicated frontostriatal circuitry 12:20 Concluding discussant: Michael I. Posner, University of Oregon ---------------------------------------------------------------------------- Attending the Symposium Sessions on Friday, October 9 will be held in McConomy Auditorium, University Center, Carnegie Mellon. Sessions on Saturday, October 10 and Sunday, October 11 will be held in the Adamson Wing, Room 135 Baker Hall. Admission is free, and everyone is welcome to attend. Out of town visitors can contact Mary Anne Cowden, (412) 268-3151, mac+ at cmu.edu, for additional information. --------------------------------------------------------------------------- This material is based on the symposium web-page: http://www.cnbc.cmu.edu/carnegie-symposium ---------------------------------------------------------------------------- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Mary Anne Cowden, Baker Hall 346 C Psychology Dept, Carnegie Mellon University 5000 Forbes Ave., Pittsburgh, PA 15213 Phone: 412/268-3151 Fax: 412/268-3464 http://www.contrib.andrew.cmu.edu/~mac/ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From zhuh at santafe.edu Mon Aug 31 16:01:23 1998 From: zhuh at santafe.edu (Huaiyu Zhu) Date: Mon, 31 Aug 1998 14:01:23 -0600 (MDT) Subject: Error decomposition and model complexity. In-Reply-To: Message-ID: The following paper has been submitted to Neural Computation: http://www.santafe.edu/~zhuh/draft/edmc.ps.gz Error Decomposition and Model Complexity Huaiyu Zhu Bayesian information geometry provides a general error decomposition theorem for arbitrary statistical models and a family of information deviations that include Kullback-Leibler information as a special case. When applied to Gaussian measures it takes the classical Hilbert space (Sobolev space) theories for estimation (regression, filtering, approximation, smoothing) as a special case. 
When the statistical and computational models are properly distinguished, the dilemmas of over-fitting and ``curse of dimensionality'' disappear, and the optimal model order disregarding computing cost is always infinity. Cited papers that have not appeared in print can be obtained through the web page below. -- Huaiyu Zhu Tel: 1 505 984 8800 ext 305 Santa Fe Institute Fax: 1 505 982 0565 1399 Hyde Park Road mailto:zhuh at santafe.edu Santa Fe, NM 87501 http://www.santafe.edu/~zhuh/ USA ftp://ftp.santafe.edu/pub/zhuh/ From becker at curie.psychology.mcmaster.ca Mon Aug 31 16:25:37 1998 From: becker at curie.psychology.mcmaster.ca (Sue Becker) Date: Mon, 31 Aug 1998 16:25:37 -0400 (EDT) Subject: Calls for Participation: NIPS*98 Workshops Message-ID: Dear Connectionists, Below are brief announcements of the 20 NIPS*98 workshops taking place in Breckenridge, Colorado on December 4-5 following the main conference in Denver. Many of these have published web pages with further details. See http://www.cs.cmu.edu/Groups/NIPS/1998/Workshops.html and the URLs listed below. Rich Zemel and Sue Becker, NIPS*98 Workshops Co-chairs ---------------------------------------------------------------------- DYNAMICS IN NETWORKS OF SPIKING NEURONS http://diwww.epfl.ch/w3lami/team/gerstner/NIPS_works.html Organizer: W. Gerstner (Lausanne, Switzerland) Networks of spiking neurons have several interesting dynamic properties, for example very rapid and characteristic transients, synchronous firing and asynchronous states. A better understanding of typical phenomena has important implications for problems associated with neuronal coding (spikes or rates). For example, the population activity is a rate-type quantity, but does not need temporal averaging - which suggests fast rate coding as a potential strategy. The idea of the workshop is to start from mathematical models of network dynamics, see what is known in terms of results, and then try to find out what the implications for 'coding' in the most general sense could be. ---------------------------------------------------------------------- POPULATION CODING Organizers: Glen D. Brown, The Salk Institute Kechen Zhang, The Salk Institute We will explore experimental approaches to population coding in three parts. First, we will examine techniques for recording from populations of neurons including electrode arrays and optical methods. Next, we will discuss spike-sorting and other issues in data analysis. Finally, we will examine strategies for interpreting population data, including population recordings from the hippocampus. To facilitate discussion, we are establishing a database of neuronal-population recordings that will be available for analysis and interpretation. For more information, please contact Glen Brown (glen at salk.edu) or Kechen Zhang (zhang at salk.edu) Computational Neurobiology Laboratory The Salk Institute for Biological Studies 10010 North Torrey Pines Road La Jolla, CA 92037 ---------------------------------------------------------------------- TEMPORAL CODING: IS THERE EVIDENCE FOR IT AND WHAT IS ITS FUNCTION? http://www.cs.cmu.edu/Groups/NIPS/1998/Workshop-CFParticipation/hatsopoulos.html Organizers: Nicho Hatsopoulos and Harel Shouval Brown University Departments of Neuroscience and Physics One of the most fundamental issues in neuroscience concerns the exact nature of neural coding or representation. The standard view is that information is represented in the firing rates of single or populations of neurons.
Recently, a growing body of research has provided evidence for coding strategies based on more precise temporal relationships among spikes. These are some of the questions that the workshop intends to address: 1. What do we mean by temporal coding? What time resolution constitutes a temporal code? 2. What evidence is there for temporal coding in the nervous system? 3. What functional role does it play? What computational problem can it solve that firing rate cannot? 4. Is it feasible to implement given the properties of neurons and their interactions? We intend to organize it as a debate with formal presentations and informal discussion with some of the major figures in the field. Different views regarding this subject will be presented. We will invite speakers doing work in a variety of areas including both vertebrate and invertebrate systems. ---------------------------------------------------------------------- OPTICAL IMAGING OF THE VISUAL CORTEX http://camelot.mssm.edu/~udi Organizers: Ehud Kaplan, Gary Blasdel It is clear that any attempt to model brain function or development will require access to data about the spatio-temporal distribution of activity in the brain. Optical imaging of the brain provides a unique opportunity to obtain such maps, and thus is essential for scientists who are interested in theoretical approaches to neuroscience. In addition, contact of biologists with theoretical approaches could help them focus their studies on the essential theoretical questions, and on new computational, mathematical, or theoretical tools and techniques. We therefore organized a 6-hour workshop on optical imaging of the cortex, to deal with both technical issues and physiological results. The workshop will have the format of a mini-symposium, and will be chaired by Ehud Kaplan (Mt. Sinai School of Medicine) and Gary Blasdel (Harvard). Technical issues to be discussed include: 1. What is the best way to extract faint images from the noisy data? 2. How does one compare/relate functional maps? 3. What is the best wavelength for reflectance measurements? 4. What is the needed (or possible) spatial resolution? 5. How do you deal with brain movement and other artifacts? See also: http://camelot.mssm.edu/~udi ---------------------------------------------------------------------- OLFACTORY CODING: MYTHS, MODELS AND DATA http://www.wjh.harvard.edu/~linster/nips98.html Organizers: Christiane Linster, Frank Grasso and Wayne Getz Currently, two main models of olfactory coding are competing with each other: (1) the selective receptor, labeled line model which has been popularized by recent results from molecular biology, and (2) the non-selective receptor, distributive coding model, supported mainly by data from electrophysiology and imaging in the olfactory bulbs. In this workshop, we will discuss experimental evidence for each model. Theoreticians and experimentalists together will discuss the implications for olfactory coding and for neural processing in the olfactory bulb and cortex for each of the two predominant, and possibly intermediate, models. ---------------------------------------------------------------------- STATISTICAL THEORIES OF CORTICAL FUNCTION http://www.cnl.salk.edu/~rao/workshop.html Organizers: Rajesh P.N. Rao, Salk Institute (rao at salk.edu) Bruno A. Olshausen, UC Davis (bruno at redwood.ucdavis.edu) Michael S.
Lewicki, Salk Institute (lewicki at salk.edu) Participants are invited to attend a post-NIPS workshop on theories of cortical function based on well-defined statistical principles such as maximum likelihood and Bayesian estimation. Topics that are expected to be addressed include: statistical interpretations of the function of lateral and cortico-cortical feedback connections, theories of perception and neural representations in the cortex, and development of cortical receptive field properties from natural signals. For further details, see: http://www.cnl.salk.edu/~rao/workshop.html ---------------------------------------------------------------------- LEARNING FROM AMBIGUOUS AND COMPLEX EXAMPLES Organizers: Oded Maron, PHZ Capital Partners Thomas Dietterich, Oregon State University Frameworks such as supervised learning, unsupervised learning, and reinforcement learning have many established algorithms and theoretical tools to analyze them. However, there are many learning problems that do not fall into any of these established frameworks. Specifically, situations where the examples are ambiguously labeled or cannot be simply represented as a feature vector tend to be difficult for these frameworks. This workshop will bring together researchers who are interested in learning from ambiguous and complex examples. The workshop will include, but not be limited to, discussions of Multiple-Instance Learning, TDNN, bounded inconsistency, and other frameworks for learning in unusual situations. ---------------------------------------------------------------------- TURNKEY ALGORITHMS FOR IMPROVING GENERALIZERS http://ic.arc.nasa.gov/ic/people/kagan/nips98.html Organizers: Kagan Tumer and David Wolpert NASA Ames Research Center Abstract: Methods for improving generalizers, such as stacking, bagging, boosting and error correcting output codes (ECOCs) have recently been receiving a lot of attention. We call such techniques "turnkey" techniques. This reflects the fact that they were designed to improve the generalization ability of generic learning algorithms, without detailed knowledge about the inner workings of those learners. Whether one particular turnkey technique is, in general, "better" than all others, and if so under what circumstances, is a hotly debated issue. Furthermore, it isn't clear whether it is meaningful to ask that question without specific prior assumptions (e.g., specific domain knowledge). This workshop aims at investigating these issues, building a solid understanding of how and when turnkey techniques help generalization ability, and lay out a road map to where the turnkey methods should go. ---------------------------------------------------------------------- MINING MASSIVE DATABASES: SCALABLE ALGORITHMS FOR DATA MINING http://research.microsoft.com/~fayyad/nips98/ Organizers: Usama Fayyad and Padhraic Smyth With the explosive growth in the number of "data owners", interest in scalable, integrated, data mining tools is reaching new heights. This 1-day workshop aims at bringing together researchers and practitioners from several communities to address topics of mutual interest (and misunderstanding) such as: scaling clustering and prediction to large databases, robust algorithms for high dimensions, mathmatical approaches to mining massive datasets, anytime algorithms, and dealing with discrete, mixed, and multimedia (unstructured) data. 
The invited talks will be used to drive discussion around the issues raised, common problems, and definitions of research problems that need to be addressed. Important questions include: why the need for integration with databases? why deal with massive data stores? What are most effective ways to scale algorithms? How do we help unsophisticated users visualize the data/models extracted? Contact information: Usama Fayyad (Microsoft Research), Fayyad at microsoft.com, http://research.microsoft.com/~fayyad Padhraic Smyth (U.C. Irvine), Smyth at sifnos.ics.uci.edu, http://www.ics.uci.edu/~smyth/ ---------------------------------------------------------------------- INTEGRATING SUPERVISED AND UNSUPERVISED LEARNING www.cs.cmu.edu/~mccallum/supunsup Organizers: Rich Caruana, Just Research Virginia de Sa, UCSF Andrew McCallum This workshop will debate the relationship between supervised and unsupervised learning. The discussion will run the gamut from examining the view that supervised learning can be performed by unsupervised learning of the joint distribution between the inputs and targets, to discussion of how natural learning systems do supervised learning without explicit labels, to the presentation of practical methods of combining supervised and unsupervised learning by using unsupervised clustering or unlabelled data to augment a labelled corpus. The debate should be fun because some attendees believe supervised learning has clear advantages, while others believe unsupervised learning is the only game worth playing in the long run. More information (including a call for abstracts) can be found at www.cs.cmu.edu/~mccallum/supunsup. ---------------------------------------------------------------------- LEARNING ON RELATIONAL DATA REPRESENTATIONS http://ni.cs.tu-berlin.de/nips98/ Organizers: Thore Graepel, TU Berlin, Germany Ralf Herbrich, TU Berlin, Germany Klaus Obermayer, TU Berlin, Germany Symbolic (structured) data representations such as strings, graphs or logical expressions often provide a more natural basis for learning than vector space representations which are the standard paradigm in connectionism. Symbolic representations are currently subject to an intensive discussion (cf. the recent postings on the connectionist mailing list), which focuses on the question if connectionist models can adequately process symbolic input data. One way of dealing with structured data is to characterize them in relation to each other. To this end a set of data items can be characterized by defining a dissimilarity or distance measure on pairs of data items and to provide learning algorithms with a dissimilarity matrix of a set of training data. Prior knowledge about the data at hand can be incorporated explicitly in the definition of the dissimilarity measure. One can even go as far as trying to learn a distance measure appropriate for the task at hand. This procedure may provide a bridge between the vector space and the "structural" approaches to pattern recognition and should thus be of interest to people from both communities. Additionally, pairwise and other non-vectorial input data occur frequently in empirical sciences and pose new problems for supervised and unsupervised learning techniques. 
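A minimal sketch of the pairwise-dissimilarity idea described in the relational-data workshop above, assuming string edit distance as the dissimilarity measure; the items and the threshold grouping are invented for illustration:

# Illustrative only: symbolic items (strings) characterized purely by a
# pairwise dissimilarity matrix, the input format discussed above.
def edit_distance(a, b):
    # standard dynamic-programming Levenshtein distance
    d = [[i + j if i * j == 0 else 0 for j in range(len(b) + 1)]
         for i in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i][j] = min(d[i - 1][j] + 1,
                          d[i][j - 1] + 1,
                          d[i - 1][j - 1] + (a[i - 1] != b[j - 1]))
    return d[-1][-1]

items = ["spike", "spikes", "spoke", "neuron", "neurons", "kernel"]
D = [[edit_distance(x, y) for y in items] for x in items]

for x, row in zip(items, D):
    print(f"{x:>8}: {row}")
# A learner working in this setting never sees a feature vector for an
# item, only the matrix D; e.g. simple single-link grouping at distance
# <= 2 already puts "spike"/"spikes"/"spoke" together and
# "neuron"/"neurons" together, while "kernel" stays on its own.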
More information can be found at http://ni.cs.tu-berlin.de/nips98/ ------------------------------------------------------------------ SEQUENTIAL INFERENCE AND LEARNING http://svr-www.eng.cam.ac.uk/~jfgf/workshop.html Organizers: Mahesan Niranjan, Cambridge University Engineering Department Arnaud Doucet, Cambridge University Engineering Department Nando de Freitas, Cambridge University Engineering Department Sequential techniques are important in many applications of neural networks involving real-time signal processing, where data arrival is inherently sequential. Furthermore, one might wish to adopt a sequential training strategy to deal with non-stationarity in signals, so that information from the recent past is lent more credence than information from the distant past. Sequential methods also allow us to efficiently compute important model diagnostic tools such as the one-step-ahead prediction densities. The advent of cheap and massive computational power has stimulated many recent advances in this field, including dynamic graphical models, Expectation-Maximisation (EM) inference and learning for dynamical models, dynamic Kalman mixture models and sequential Monte Carlo sampling methods. More importantly, such methods are being applied to a large number of interesting real problems such as computer vision, econometrics, medical prognosis, tracking, communications, blind deconvolution, statistical diagnosis, automatic control and neural network training. _______________________________________________________________________________ ABSTRACTION AND HIERARCHY IN REINFORCEMENT LEARNING http://www-anw.cs.umass.edu/~dprecup/call_for_participation.html Organizers: Tom Dietterich, Oregon State University Leslie Kaelbling, Brown University Ron Parr, Stanford University Doina Precup, University of Massachusetts, Amherst When making everyday decisions, people are able to foresee the consequences of their possible courses of action at multiple levels of abstraction. Recent research in reinforcement learning (RL) has focused on the way in which knowledge about abstract actions and abstract representations can be incorporated into the framework of Markov Decision Processes (MDPs). Several theoretical results and applications suggest that these methods can improve significantly the scalability of reinforcement learning systems by accelerating learning and by promoting sharing and re-use of learned subtasks. This workshop aims to address the following issues in this area: - Task formulation and automated task creation - The degree and complexity of action models - The integration of different abstraction methods - Hidden state issues - Utility and computational efficiency considerations - Multi-layer abstractions - Temporally extended perception - The design of autonomous agents based on hierarchical RL architectures We are looking for volunteers to lead discussions and participate in panels. We will also accept some technical papers for presentations. For more details, please check out the workshop page: http://www-anw.cs.umass.edu/~dprecup/call_for_participation.html ---------------------------------------------------------------------- MOVEMENT PRIMITIVES: BUILDING BLOCKS FOR LEARNING MOTOR CONTROL http://www-slab.usc.edu/events/nips98 Organizers: Stefan Schaal (USC/ERATO(JST)) and Steve DeWeerth (GaTech) Traditionally, learning control has been dominated by representations that generate low level actions in response to some measured state information. 
The learning of appropriate trajectory plans or control policies is usually based on optimization approaches and reinforcement learning. It is well known that these methods do not scale well to high dimensional control problems, that they are computationally very expensive, that they are not particularly robust to unforeseen perturbations in the environment, and that it is hard to re-use these representations for related movement tasks. In order to make progress towards a better understanding of biology and to create movement systems that can automatically build new representations, it seems to be necessary to develop a framework of how to control and to learn control with movement primitives. This workshop will bring together neuroscientists, roboticists, engineers, and mathematicians to explore how to approach the topic of movement primitives in a principled way. Topics of the workshop include the questions such as: what are appropriate movement primitives, how are primitives learned, how can primitives be inserted into control loops, how are primitives sequenced, how are primitives combined to form new primitives, how is sensory information used to modulate primitives, how primitives primed for a particular task, etc. These topics will be addressed from a hybrid perspective combining biological and artificial movement systems. ---------------------------------------------------------------------- LARGE MARGIN CLASSIFIERS http://svm.first.gmd.de/nips98/ Organizers: Alex J. Smola, Peter Bartlett, Bernhard Schoelkopf, Dale Schuurmans Many pattern classifiers are represented as thresholded real-valued functions, eg: sigmoid neural networks, support vector machines, voting classifiers, and Bayesian schemes. Recent theoretical and experimental results show that such learning algorithms frequently produce classifiers with large margins---where the margin is the amount by which the classifier's prediction is to the correct side of threshold. This has led to the important discovery that there is a connection between large margins and good generalization performance: classifiers that achieve large margins on given training data also tend to perform well on future test data. This workshop aims to provide an overview of recent developments in large margin classifiers (ranging from theoretical results to applications), to explore connections with other methods, and to identify directions for future research. The workshop will consist of four sessions over two days: - Mathematical Programming - Support Vector and Kernel Methods, - Voting Methods (Boosting, Bagging, Arcing, etc), and - Connections with Other Topics (including an organized panel discussion) Further details can be found at http://svm.first.gmd.de/nips98/ ---------------------------------------------------------------------- DEVELOPMENT AND MATURATION IN NATURAL AND ARTIFICIAL STRUCTURES http://www.cs.cmu.edu/Groups/NIPS/1998/Workshop-CFParticipation/haith.html Organizers: Gary Haith, Computational Sciences, NASA Ames Research Center Jeff Elman, Cognitive Science, UCSD Silvano Colombano, Computational Sciences, NASA Ames Research Center Marshall Haith, Developmental Psychology, University of Denver We believe that an ongoing collaboration between computational work and developmental work could help unravel some of the most difficult issues in each domain. 
Computational work can address dynamic, hierarchical developmental processes that have been relatively intractable to traditional developmental analysis, and developmental principles and theory can generate insight into the process of building and modeling complex and adaptive computational structures. In hopes of bringing developmental processes and analysis into the neural modeling mainstream, this session will focus developmental modelers and theorists on the task of constructing a set of working questions, issues and approaches. The session will hopefully include researchers studying developmental phenomena across all levels of scale and analysis, with the aim of highlighting both system-specific and general features of development. For more information, contact: Gary Haith, Computational Sciences, NASA Ames Research Center phone #: (650) 604-3049 FAX #: (650) 604-3594 E-mail: haith at ptolemy.arc.nasa.gov Mail: NASA Ames Research Center Mail Stop 269-3 Mountain View, CA 94035-1000 ---------------------------------------------------------------------- HYBRID NEURAL SYMBOLIC INTEGRATION http://osiris.sunderland.ac.uk/~cs0stw/wermter/workshops/nips-workshop.html Organizers: Stefan Wermter, University of Sunderland, UK Ron Sun, University of Alabama, USA In the past it was very controversial whether neural or symbolic approaches alone will be sufficient to provide a general framework for intelligent processing. The motivation for the integration of symbolic and neural models of cognition and intelligent behavior comes from many different sources. From the perspective of cognitive neuroscience, a symbolic interpretation of an artificial neural network architecture is desirable, since the brain has a neuronal structure and the capability to perform symbolic processing. From the perspective of knowledge-based processing, hybrid neural/symbolic representations are advantageous, since different mutually complementary properties can be integrated. However, neural representations show advantages for gradual analog plausibility, learning, robust fault-tolerant processing, and generalization to similar input. Areas of interest include: Integration of symbolic and neural techniques for language and speech processing, reasoning and inferencing, data mining, integration for vision, language, multimedia; combining fuzzy/neuro techniques in engineering; exploratory research in emergent symbolic behavior based on neural networks, interpretation and explanation of neural networks, knowledge extraction from neural networks, interacting knowledge representations, dynamic systems and recurrent networks, evolutionary techniques for cognitive tasks (language, reasoning, etc), autonomous learning systems for cognitive agents that utilize both neural and symbolic learning techniques. For more information please see http://osiris.sunderland.ac.uk/~cs0stw/wermter/workshops/nips-workshop.html Workshop contact person: Professor Stefan Wermter Research Chair in Intelligent Systems University of Sunderland School of Computing & Information Systems St Peters Way Sunderland SR6 0DD United Kingdom phone: +44 191 515 3279 fax: +44 191 515 2781 email: stefan.wermter at sunderland.ac.uk http://osiris.sunderland.ac.uk/~cs0stw/ ---------------------------------------------------------------------- SIMPLE INFERENCE HEURISTICS VS. COMPLEX DECISION MACHINES http://www.cs.cmu.edu/Groups/NIPS/1998/Workshop-CFParticipation/todd.html Organizers: Peter M. 
Todd, Laura Martignon, Kathryn Blackmond Laskey Participants and presentations are invited for this post-NIPS workshop on the contrast in both psychology and machine learning between a probabilistically-defined view of rational decision making with its apparent demand for complex Bayesian models, and a more performance-based view of rationality built on the use of simple, fast and frugal decision heuristics. ---------------------------------------------------------------------- CONTINUOUS LEARNING http://www.forwiss.uni-erlangen.de/aknn/cont-learn/ Organizers: Peter Protzel, Lars Kindermann, Achim Lewandowski, and Michael Tagscherer FORWISS and Chemnitz University of Technology, Germany By continuous learning we mean that learning takes place all the time and is not interrupted, that there is no difference between periods of training and operation, and that learning AND operation start with the first pattern. In this workshop, we will especially focus on the approximation of non-linear, time-varying functions. The goal is modeling and adapting the model to follow the changes of the underlying process, not merely forecasting the next output. In order to facilitate the comparison of the various methods, we provide different benchmark data sets and participants are encouraged to discuss their results on these benchmarks during the workshop. Further information: http://www.forwiss.uni-erlangen.de/aknn/cont-learn/ ---------------------------------------------------------------------- LEARNING CHIPS AND NEUROBOTS http://bach.ece.jhu.edu/nips98 Organizers: Gert Cauwenberghs, Johns Hopkins University Ralph Etienne-Cummings, Johns Hopkins University Marwan Jabri, Sydney University This workshop aims at a better understanding of how different approaches to learning and sensorimotor control, including algorithms and hardware, from backgrounds in neuromorphic VLSI, robotics, neural nets, AI, genetic programming, etc., can be combined to create more intelligent systems interacting with their environment. We encourage active participation, and welcome live demonstrations of systems. The panel has a representation over a wide range of disciplines. Machine learning approaches include: reinforcement learning, TD-lambda (or predictive Hebbian learning), Q-learning, and classical as well as operant conditioning. VLSI implementations cover some of these, integrated on-chip, plus the sensory and motor interfaces. Evolutionary approaches cover genetic techniques, applied to populations of robots. Finally, we have designers of microrobots and walking robots on the panel. This list is by no means exhaustive! More information can be found at URL: http://bach.ece.jhu.edu/nips98 __________________________________________________________________________
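A minimal sketch of one of the learning approaches listed in the Learning Chips and Neurobots description above (tabular Q-learning); the five-state corridor task and all constants are invented for illustration and are not taken from the workshop:

# Illustrative only: tabular Q-learning on a toy corridor, the simplest
# form of the reinforcement-learning methods named above.
import random

N_STATES, GOAL = 5, 4               # corridor 0..4, reward at the right end
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.3   # learning rate, discount, exploration
Q = [[0.0, 0.0] for _ in range(N_STATES)]   # Q[state][action], 0=left 1=right

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    return s2, (1.0 if s2 == GOAL else 0.0)

random.seed(2)
for _ in range(500):                          # episodes
    s = 0
    while s != GOAL:
        a = random.randrange(2) if random.random() < EPS else Q[s].index(max(Q[s]))
        s2, r = step(s, a)
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
        s = s2

for s in range(N_STATES):
    print(f"state {s}: left {Q[s][0]:.2f}  right {Q[s][1]:.2f}")
# After training, "right" has the higher value in every non-goal state,
# i.e. the learned policy walks the agent toward the rewarded end.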
From ajain at iconixpharm.com Thu Aug 6 16:53:45 1998 From: ajain at iconixpharm.com (Ajay Jain) Date: 06 Aug 98 13:53:45 -0700 Subject: Positions available: Iconix Pharmaceuticals Message-ID: ***** PLEASE FORWARD TO RELEVANT MAILING LISTS ***** I am seeking one or two individuals for PhD-level scientist positions at Iconix Pharmaceuticals. People with applied neural network experience are particularly of interest (see details below). Iconix is a San Francisco Bay Area biopharmaceutical company that is establishing the new area of Chemical Genomics. The company's approach aims to advance the discovery process for human drugs through the systematic acquisition, integration, and analysis of genetic and chemical information. We are seeking people to develop and implement new algorithms for interpreting large, complex data sets generated from the interaction of many small molecules with numerous gene products. Ideal candidates will have a PhD in computer science and demonstrated success in applying sophisticated computation to real-world problems (e.g. drug discovery, computational biology, object recognition, robotics, etc.). Experience in machine-learning, computational geometry, or physical modeling is beneficial, as is formal training in chemistry, biology, or physics. Experience in applying neural network techniques to problems involving noisy data is particularly relevant. As the member of an interdisciplinary team, you will develop, implement, and apply novel algorithms to the interpretation of chemical and biological information. Candidates for these positions will have excellent communication skills, excellent work history and references, and the ability to work both independently and as part of a multidisciplinary team of scientists, including colleagues from biology, chemistry, genetics, and other fields. We offer competitive salaries, stock options, and excellent benefits. Please send or fax your resume to: Human Resources Dept., Iconix Pharmaceuticals, Inc., 850 Maude Avenue, Mountain View, CA 94043. Attn: Chemical Information Sciences Group.
FAX (650) 526-3034, EMAIL hr at iconixpharm.com. Iconix Pharmaceuticals, Inc., is an Equal Opportunity Employer. The type of research involved is exemplified by the following papers: J. Ruppert, W. Welch, and A. N. Jain. Automatic Characterization and Identification of Protein Binding Pockets for Molecular Docking. Protein Science 6: 524-533, 1996. A. N. Jain. Scoring Non-Covalent Ligand-Protein Interactions: A Continuous Differentiable Function Tuned to Compute Binding Affinities. Journal of Computer-Aided Molecular Design. 10:5, 427-440, 1996. W. Welch, J. Ruppert, and A. N. Jain. Hammerhead: Fast, Fully Automated Docking of Flexible Ligands to Protein Binding Sites. Chemistry and Biology 3: 449-462, 1996. A. N. Jain, N. L. Harris, and J. Y. Park. Quantitative Binding Site Model Generation: Compass Applied to Multiple Chemotypes Targeting the 5HTlA Receptor. Journal of Medicinal Chemistry 38: 1295-1307, 1995. A. N. Jain, T. G. Dietterich, R. L. Lathrop, D. Chapman, R. E. Critchlow, B. E. Bauer, T. A. Webster, and T. Lozano-Perez. Compass: A Shape-Based Machine Learning Tool for Drug Design. Journal of Computer-Aided Molecular Design 8(6): 635-652, 1994. A. N. Jain, K. Koile, D. Chapman. Compass: Predicting Biological Activities from Molecular Surface Properties; Performance Comparisons on a Steroid Benchmark. Journal of Medicinal Chemistry 37: 2315-2327, 1994. T. G. Dietterich, A. N. Jain, R. L. Lathrop, and T. Lozano-Perez. A Comparison of Dynamic Reposing and Tangent Distance for Drug Activity Prediction. In Advances in Neural information Processing Systems 6, ed. J. D. Cowan, G. Tesauro, and J. Alspector. San Francisco, CA: Morgan Kaufmann. 1994. -------------------------------------------------------------------- Dr. Ajay N. Jain Associate Director, Chemical Information Sciences 850 Maude Ave. Iconix Pharmaceuticals Mountain View, CA 94043 ajain at iconixpharm.com From cindy at cns.bu.edu Thu Aug 6 14:43:12 1998 From: cindy at cns.bu.edu (Cynthia Bradford) Date: Thu, 6 Aug 1998 14:43:12 -0400 Subject: Boston University: Cognitive and Neural Systems 1999 Meeting Message-ID: <199808061843.OAA28727@retina.bu.edu> *****CALL FOR PAPERS***** THIRD INTERNATIONAL CONFERENCE ON COGNITIVE AND NEURAL SYSTEMS May 26-29, 1999 Sponsored by Boston University's Center for Adaptive Systems and Department of Cognitive and Neural Systems with financial support from DARPA and ONR How Does the Brain Control Behavior? How Can Technology Emulate Biological Intelligence? The conference will include invited tutorials and lectures, and contributed lectures and posters by experts on the biology and technology of how the brain and other intelligent systems adapt to a changing world. The conference is aimed at researchers and students of computational neuroscience, connectionist cognitive science, artificial neural networks, neuromorphic engineering, and artificial intelligence. A single oral or poster session enables all presented work to be highly visible. Abstract submissions encourage submissions of the latest results. Costs are kept at a minimum without compromising the quality of meeting handouts and social events. 
CALL FOR ABSTRACTS Session Topics: * vision * spatial mapping and navigation * object recognition * neural circuit models * image understanding * neural system models * audition * mathematics of neural systems * speech and language * robotics * unsupervised learning * hybrid systems (fuzzy, evolutionary, digital) * supervised learning * neuromorphic VLSI * reinforcement and emotion * industrial applications * sensory-motor control * other * cognition, planning, and attention Contributed Abstracts must be received, in English, by January 29, 1999. Notification of acceptance will be given by February 28, 1999. A meeting registration fee of $45 for regular attendees and $30 for students must accompany each Abstract. See Registration Information for details. The fee will be returned if the Abstract is not accepted for presentation and publication in the meeting proceedings. Registration fees of accepted abstracts will be returned on request only until April 15, 1999. Each Abstract should fit on one 8.5" x 11" white page with 1" margins on all sides, single-column format, single-spaced, Times Roman or similar font of 10 points or larger, printed on one side of the page only. Fax submissions will not be accepted. Abstract title, author name(s), affiliation(s), mailing, and email address(es) should begin each Abstract. An accompanying cover letter should include: Full title of Abstract; corresponding author and presenting author name, address, telephone, fax, and email address; and a first and second choice from among the topics above, including whether it is biological (B) or technological (T) work. Example: first choice: vision (T); second choice: neural system models (B). (Talks will be 15 minutes long. Posters will be up for a full day. Overhead, slide, and VCR facilities will be available for talks.) Abstracts which do not meet these requirements or which are submitted with insufficient funds will be returned. Accepted Abstracts will be printed in the conference proceedings volume. No longer paper will be required. The original and 3 copies of each Abstract should be sent to: Cynthia Bradford, Boston University, Department of Cognitive and Neural Systems, 677 Beacon Street, Boston, MA 02215. REGISTRATION INFORMATION: Early registration is recommended. To register, please fill out the registration form below. Student registrations must be accompanied by a letter of verification from a department chairperson or faculty/research advisor. If accompanied by an Abstract or if paying by check, mail to the address above. If paying by credit card, mail as above, or fax to (617) 353-7755, or email to cindy at cns.bu.edu. The registration fee will help to pay for a reception, 6 coffee breaks, and the meeting proceedings. STUDENT FELLOWSHIPS: Fellowships for PhD candidates and postdoctoral fellows are available to cover meeting travel and living costs. The deadline to apply for fellowship support is January 29, 1999. Applicants will be notified by February 28, 1999. Each application should include the applicant's CV, including name; mailing address; email address; current student status; faculty or PhD research advisor's name, address, and email address; relevant courses and other educational data; and a list of research articles. A letter from the listed faculty or PhD advisor on official institutional stationery should accompany the application and summarize how the candidate may benefit from the meeting. Students who also submit an Abstract need to include the registration fee with their Abstract. 
Reimbursement checks will be distributed after the meeting. REGISTRATION FORM Third International Conference on Cognitive and Neural Systems Department of Cognitive and Neural Systems Boston University 677 Beacon Street Boston, Massachusetts 02215 Tutorials: May 26, 1999 Meeting: May 27-29, 1999 FAX: (617) 353-7755 (Please Type or Print) Mr/Ms/Dr/Prof: _____________________________________________________ Name: ______________________________________________________________ Affiliation: _______________________________________________________ Address: ___________________________________________________________ City, State, Postal Code: __________________________________________ Phone and Fax: _____________________________________________________ Email: _____________________________________________________________ The conference registration fee includes the meeting program, reception, two coffee breaks each day, and meeting proceedings. The tutorial registration fee includes tutorial notes and two coffee breaks. CHECK ONE: ( ) $70 Conference plus Tutorial (Regular) ( ) $45 Conference plus Tutorial (Student) ( ) $45 Conference Only (Regular) ( ) $30 Conference Only (Student) ( ) $25 Tutorial Only (Regular) ( ) $15 Tutorial Only (Student) METHOD OF PAYMENT (please fax or mail): [ ] Enclosed is a check made payable to "Boston University". Checks must be made payable in US dollars and issued by a US correspondent bank. Each registrant is responsible for any and all bank charges. [ ] I wish to pay my fees by credit card (MasterCard, Visa, or Discover Card only). Name as it appears on the card: _____________________________________ Type of card: _______________________________________________________ Account number: _____________________________________________________ Expiration date: ____________________________________________________ Signature: __________________________________________________________ From giro at open.brain.riken.go.jp Fri Aug 7 02:27:37 1998 From: giro at open.brain.riken.go.jp (Dr. Mark Girolami) Date: Fri, 07 Aug 1998 15:27:37 +0900 Subject: Two ICA Papers Available Message-ID: <35CA9E59.5929@open.brain.riken.go.jp> Dear Colleagues, The following papers are available in gzipped postscript form at the following website http://www-cis.paisley.ac.uk/scripts/staff.pl//giro-ci0/index.html Many Thanks Best Rgds Mark Girolami -------------------------------------------------------------------- Title: An Alternative Perspective on Adaptive Independent Component Analysis Algorithms Author: Mark Girolami Publication: Neural Computation, Vol.10, No.8, pp 2103-2114, 1998. Abstract: This paper develops an extended independent component analysis algorithm for mixtures of arbitrary sub-Gaussian and super-Gaussian sources. The Gaussian mixture model of Pearson is employed in deriving a closed-form generic score function for strictly sub-Gaussian sources. This is combined with the score function for a uni-modal super-Gaussian density to provide a computationally simple yet powerful algorithm for performing independent component analysis on arbitrary mixtures of non-Gaussian sources. ------------------------------------- Title: A Common Neural Network Model for Unsupervised Exploratory Data Analysis and Independent Component Analysis Authors: Mark Girolami, Andrzej Cichocki, and Shun-Ichi Amari Publication: I.E.E.E Transactions on Neural Networks, To Appear. 
Abstract: This paper presents the derivation of an unsupervised learning algorithm, which enables the identification and visualisation of latent structure within ensembles of high dimensional data. This provides a linear projection of the data onto a lower dimensional subspace to identify the characteristic structure of the observations' independent latent causes. The algorithm is shown to be a very promising tool for unsupervised exploratory data analysis and data visualisation. Experimental results confirm the attractiveness of this technique for exploratory data analysis and an empirical comparison is made with the recently proposed Generative Topographic Mapping (GTM) and standard principal component analysis (PCA). Based on standard probability density models a generic nonlinearity is developed which allows both: (1) identification and visualisation of dichotomised clusters inherent in the observed data, and (2) separation of sources with arbitrary distributions from mixtures, whose dimensionality may be greater than the number of sources. The resulting algorithm is therefore also a generalised neural approach to independent component analysis (ICA) and it is considered to be a promising method for analysis of real-world data that will consist of sub- and super-Gaussian components such as biomedical signals. -- ---------------------------------------------- Dr. Mark Girolami (TM) RIKEN, Brain Science Institute Laboratory for Open Information Systems 2-1 Hirosawa, Wako-shi, Saitama 351-01, Japan Email: giro at open.brain.riken.go.jp Tel: +81 48 467 9666 Tel: +81 48 462 3769 (apartment) Fax: +81 48 467 9694 --------------------------------------------- Currently on Secondment From: Department of Computing and Information Systems University of Paisley High Street, PA1 2BE Scotland, UK Email: giro0ci at paisley.ac.uk Tel: +44 141 848 3963 Fax: +44 141 848 3542 Secretary: Mrs E Campbell Tel: +44 141 848 3966 --------------------------------------------- From sshams at biodiscovery.com Fri Aug 7 19:05:23 1998 From: sshams at biodiscovery.com (Soheil Shams) Date: Fri, 07 Aug 1998 16:05:23 -0700 Subject: Position: Image Processing Scientist Message-ID: <35CB8833.CD397DEA@biodiscovery.com> We are looking for a talented, creative individual with a strong background in image processing, machine vision, neural networks, or pattern recognition. This position involves development and implementation of machine vision and image processing algorithms for a number of ongoing and planned projects in high-throughput genetic expression analysis. The position requires the ability to formulate problem descriptions through interaction with end-user customers. These technical issues must then be transformed into innovative and practical algorithmic solutions. We expect our scientists to have outstanding written/oral communication skills and encourage publications in scientific journals. Requirements: A Ph.D. in Electrical Engineering, Computer Science, Physics, or related field, or equivalent experience is required. Knowledge of biology and genetics is a plus but not necessary. Two years' experience with Matlab and at least 5 years' programming experience is required. Commercial software development experience is a plus. For consideration, please send your résumé along with a cover letter to E-mail: HR at BioDiscovery.com Fax: (310) 966-9346 Snail-mail: BioDiscovery, Inc. 11150 W. Olympic Blvd. Suite 805E Los Angeles, CA 90064 Please include pointers to your work online if applicable. BioDiscovery, Inc.
is an early-stage start-up company dedicated to the development of state-of-the-art bioinformatics tools for molecular biology and genomics research applications. We are a leading gene expression image and data analysis firm with an outstanding client list and a progressive industry stance. We are rapidly growing and are looking for talented individuals with experience and motivation to take on significant responsibility and deliver with minimal supervision. BioDiscovery is an equal opportunity employer and our shop has a friendly, intense atmosphere. We are headquartered in sunny Southern California close to the UCLA campus. From emilosam at ubbg.etf.bg.ac.yu Sat Aug 8 01:23:32 1998 From: emilosam at ubbg.etf.bg.ac.yu (Milosavljevic Milan) Date: Sat, 8 Aug 1998 07:23:32 +0200 Subject: Paper in Electronic Letters Message-ID: <01bdc28c$aedbeae0$LocalHost@milan> Dear Connectionists, The following paper will be published in Electronics Letters, No. 16 (August 1998). A printed copy can be obtained upon request by sending e-mail to l.cander at rl.ac.uk or emilosam at ubbg.etf.bg.ac.yu ---------------------------------------------------------------------- Title: Ionospheric forecasting technique by artificial neural network Authors: Ljiljana Cander (1), Milan Milosavljevic (2,3), Srdjan Stankovic (2), Sasa Tomasevic (2) (1) Rutherford Appleton Laboratory, Radio Communications Research Unit Chilton, Didcot, Oxon OX11 0QX, UK (2) Faculty of Electrical Engineering, Belgrade University, Yugoslavia (3) Institute for Applied Mathematics and Electronics, Belgrade, Yugoslavia Abstract: An artificial neural network method is applied to the development of an ionospheric forecasting technique for one hour ahead. Comparisons between observed and predicted values of the critical frequency of the F2 layer, foF2, and the total electron content, TEC, are presented to show the appropriateness of the proposed technique. Best Regards, ------------------------------------------------------- Prof. Milan Milosavljevic Faculty of Electrical Engineering University of Belgrade Bulevar Revolucije 73 11000 Belgrade, Yugoslavia tel: (381 11 ) 324 84 64 fax: (381 11 ) 324 86 81 ----------------------------------------------------- Home address: Narodnih Heroja 20/32 11 000 Belgrade, Yugoslavia -------------------------------------------------------- tel./fax (home): (381 11) 672 616 e-mail: emilosam at ubbg.etf.bg.ac.yu From radford at cs.toronto.edu Sat Aug 8 17:06:47 1998 From: radford at cs.toronto.edu (Radford Neal) Date: Sat, 8 Aug 1998 17:06:47 -0400 Subject: Software for Bayesian modeling Message-ID: <98Aug8.170654edt.345@neuron.ai.toronto.edu> FREE SOFTWARE FOR BAYESIAN MODELING USING MARKOV CHAIN MONTE CARLO A new release of my software for flexible Bayesian modeling is available. This software is meant to support research and education regarding: * Flexible Bayesian models for regression and classification based on neural networks and Gaussian processes, and for probability density estimation using mixtures. Neural net training using early stopping is also supported. * Markov chain Monte Carlo methods, and their applications to Bayesian modeling, including implementations of Metropolis, hybrid Monte Carlo, slice sampling, and tempering methods. The neural network, Gaussian process, and mixture model facilities are updated from the previous version, with some new facilities, and some bugs fixed. New programs are provided for applying a variety of Markov chain methods to distributions that are defined using simple formulas.
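For readers who have not used such samplers, here is a rough sketch of the random-walk Metropolis method named in the list above. It is a generic illustration only, with an arbitrary example target and step size assumed for demonstration; it is not taken from, and does not reflect the interface of, the FBM software itself, which is a C package driven by its own programs.

import numpy as np

def metropolis(log_p, x0, n_steps, step=0.5, seed=0):
    """Generic random-walk Metropolis sampler (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    lp = log_p(x)
    samples = []
    for _ in range(n_steps):
        proposal = x + step * rng.normal(size=x.shape)   # symmetric proposal
        lp_prop = log_p(proposal)
        if np.log(rng.uniform()) < lp_prop - lp:         # accept with prob min(1, ratio)
            x, lp = proposal, lp_prop
        samples.append(x.copy())
    return np.array(samples)

# Example target (assumed purely for illustration): a correlated 2-D Gaussian.
cov_inv = np.linalg.inv(np.array([[1.0, 0.8], [0.8, 1.0]]))
chain = metropolis(lambda z: -0.5 * z @ cov_inv @ z, x0=[0.0, 0.0], n_steps=5000)
print("empirical covariance:\n", np.cov(chain[1000:].T))   # burn-in discarded

Hybrid (Hamiltonian) Monte Carlo and slice sampling, also listed above, are more sophisticated members of the same family; the sketch shows only the simplest one.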
A BUGS-like notation can be used to define some Bayesian models. Note, however, that I have not attempted to provide comprehensive facilities for Bayesian modeling. The intent is more to demonstrate the Markov chain methods. One facility that may be of particular interest is that Annealed Importance Sampling can be used to find the marginal likelihood for a Bayesian model (though obtaining good results may require some fiddling). The software is written in C for use on Unix systems. It is free for research and educational purposes. You can get it from my web page. More directly, you can go to http://www.cs.utoronto.ca/~radford/fbm.software.html and follow the instruction there. Or you can just browse the documentation there to find out more about the software. There are also links to various papers that you might want to read before trying to use the software. Please let me know if you have any problems with obtaining or installing the software, or if you have any other comments. Radford Neal ---------------------------------------------------------------------------- Radford M. Neal radford at cs.utoronto.ca Dept. of Statistics and Dept. of Computer Science radford at utstat.utoronto.ca University of Toronto http://www.cs.utoronto.ca/~radford ---------------------------------------------------------------------------- From skoenig at cc.gatech.edu Mon Aug 10 14:28:06 1998 From: skoenig at cc.gatech.edu (Sven Koenig) Date: Mon, 10 Aug 1998 14:28:06 -0400 (EDT) Subject: CFP: AAAI Spring Symposium Message-ID: <199808101828.OAA08684@green.cc.gatech.edu> A non-text attachment was scrubbed... Name: not available Type: text Size: 4212 bytes Desc: not available Url : https://mailman.srv.cs.cmu.edu/mailman/private/connectionists/attachments/00000000/85df502b/attachment-0001.ksh From takagi at artemis.ad.kyushu-id.ac.jp Mon Aug 10 07:03:16 1998 From: takagi at artemis.ad.kyushu-id.ac.jp (takagi) Date: Mon, 10 Aug 1998 20:03:16 +0900 Subject: abstract of IIZUKA'98 papers on WEB Message-ID: <199808101103.UAA11835@artemis.ad.kyushu-id.ac.jp> A non-text attachment was scrubbed... Name: not available Type: text Size: 797 bytes Desc: not available Url : https://mailman.srv.cs.cmu.edu/mailman/private/connectionists/attachments/00000000/bf888fca/attachment-0001.ksh From Dave_Touretzky at cs.cmu.edu Tue Aug 11 03:34:27 1998 From: Dave_Touretzky at cs.cmu.edu (Dave_Touretzky@cs.cmu.edu) Date: Tue, 11 Aug 1998 03:34:27 -0400 Subject: Connectionist symbol processing: any progress? Message-ID: <23040.902820867@skinner.boltz.cs.cmu.edu> I'd like to start a debate on the current state of connectionist symbol processing? Is it dead? Or does progress continue? A few years ago I contributed a short article on "Connectionist and symbolic representations" to Michael Arbib's Handbook of Brain Theory and Neural Networks (MIT Press, 1995). In that article I explained concepts such as coarse coding, distributed representations, and RAAMs, and how people had managed to do elementary kinds of symbol processing tasks in this framework. The problem, though, was that we did not have good techniques for dealing with structured information in distributed form, or for doing tasks that require variable binding. While it is possible to do these things with a connectionist network, the result is a complex kludge that, at best, sort of works for small problems, but offers no distinct advantages over a purely symbolic implementation. 
The cases where people had shown interesting generalization behavior in connectionist nets involved simple vector-based representations, without nested structures or variable binding. People had gotten some interesting effects with localist networks, by doing spreading activation and a simple form of constraint satisfaction. Good examples are the spreading activation models of word disambiguation developed in the 1980s by Jordan Pollack and Dave Waltz, and by Gary Cottrell. But the heuristic reasoning enabled by spreading activation models is extremely limited. This approach does not create new structure on the fly, or deal with structured representations or variable binding. Those localist networks that did attempt to implement variable binding did so in a discrete, symbolic way that did not advance the parallel constraint satisfaction/heuristic reasoning agenda of earlier spreading activation research. So I concluded that connectionist symbol processing had reached a plateau, and further progress would have to await some revolutionary new insight about representations. The last really significant work in the area was, in my opinion, Tony Plate's holographic reduced representations, which offered a glimpse of how structured information might be plausibly manipulated in distributed form. (Tony received an IJCAI-91 best paper award for this work. For some reason, the journal version did not appear until 1995.) But further incremental progress did not seem possible. People still do cognitive modeling using connectionist networks. And there is some good work out there. One of my favorite examples is David Plaut's use of attractor neural networks to model deep and surface dyslexia -- an area pioneered by Geoff Hinton and Tim Shallice. But like most connectionist cognitive models, it relies on a simple feature vector representation. The problems of structured representations and variable binding have remained unsolved. No one is trying to build distributed connectionist reasoning systems any more, like the connectionist production system I built with Geoff Hinton, or Mark Derthick's microKLONE. Today, Michael Arbib is working on the second edition of his handbook, and I've been asked to update my article on connectionist symbol processing. Is it time to write an obituary for a research path that expired because the problems were too hard for the tools available? Or are there important new developments to report? I'd love to hear some good news. -- Dave Touretzky References: Arbib, M. A. (ed) (1995) Handbook of Brain Theory and Neural Networks. Cambridge, MA: MIT Press. Plaut, D. C., McClelland, J. L., Seidenberg, M. S., and Patterson, K. (1996) Understanding normal and impaired word reading: computational principles in quasi-regular domains. Psychological Review, 103(1):56-115. Plate, T. A. (1995) Holographic reduced representations. IEEE Transactions on Neural Networks, 6(3):623. Touretzky, D. S. and Hinton, G. E. (1988) A distributed connectionist production system. Cognitive Science, vol. 12, number 3, pp. 423-466. Touretzky, D. S. (1995) Connectionist and symbolic representations. In M. A. Arbib (ed.), Handbook of Brain Theory and Neural Networks, pp. 243-247. MIT Press.
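Since Plate's holographic reduced representations are the reference point in the message above, a minimal numerical sketch of the core binding operation may help readers who have not seen it. This is not Plate's code; the dimension, random vectors, and print-out are illustrative assumptions, and the involution used for unbinding is only an approximate inverse whose output would normally be matched against a clean-up memory of known vectors.

import numpy as np

def cconv(a, b):
    """Circular convolution: the HRR binding operation (computed via FFT)."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def involution(a):
    """Approximate inverse used for unbinding: a*[i] = a[-i mod n]."""
    return np.concatenate(([a[0]], a[:0:-1]))

n = 1024
rng = np.random.default_rng(1)
role = rng.normal(0.0, 1.0 / np.sqrt(n), n)     # e.g. a variable ("agent")
filler = rng.normal(0.0, 1.0 / np.sqrt(n), n)   # e.g. a value bound to it

trace = cconv(role, filler)                  # bind variable and value in one vector
decoded = cconv(involution(role), trace)     # unbind: a noisy copy of the filler

cosine = decoded @ filler / (np.linalg.norm(decoded) * np.linalg.norm(filler))
print(f"similarity of decoded vector to the original filler: {cosine:.2f}")
# Well above chance (which is near 0), so a clean-up memory can recover the filler.

The point of the construction is that the bound trace has the same dimensionality as its components, so nested structures can be built by summing several such bindings into a single vector.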
From duch at MPA-Garching.MPG.DE Tue Aug 11 12:25:36 1998 From: duch at MPA-Garching.MPG.DE (Wlodek Duch) Date: Tue, 11 Aug 1998 18:25:36 +0200 Subject: EANN'99 first call for papers Message-ID: <35D07080.31DF@mpa-garching.mpg.de> EANN '99 Fifth International Conference on Engineering Applications of Neural Networks Warsaw, Poland 13-15 September 1999 WWW page: http://www.phys.uni.torun.pl/eann99/ First Call for Papers The conference is a forum for presenting the latest results on neural network applications in technical fields. The applications may be in any engineering or technical field, including but not limited to: systems engineering, mechanical engineering, robotics, process engineering, metallurgy, pulp and paper technology, aeronautical engineering, computer science, machine vision, chemistry, chemical engineering, physics, electrical engineering, electronics, civil engineering, geophysical sciences, biomedical systems, and environmental engineering. Summaries of two pages (about 1000 words) should be sent by e-mail to eann99 at phys.uni.torun.pl by 15 February 1999 in plain text, Tex or LaTeX format. Please mention two to four keywords. Submissions will be reviewed. For information on earlier EANN conferences see the WWW http://www.abo.fi/~abulsari/EANN98.html Notification of acceptance will be sent around 15 March. The final papers will be expected by 15 April. All papers will be upto 6 pages in length. Authors are expected to register by 15 April. Authors are encouraged to send the abstracts to the conference address instead of the organisers of the special tracks. Organising committee R. Baratti, University of Cagliari, Italy L. Bobrowski, Polish Academy of Science, Poland A. Bulsari, Nonlinear Solutions Oy, Finland W. Duch, Nicholas Copernicus University, Poland J. Fernandez de Canete, University of Malaga, Spain A. Ruano, University of Algarve, Portugal D. Tsaptsinos, Kingston University, UK National program committee L. Bobrowski, IBIB Warsaw A. Cichocki, RIKEN, Japan W. Duch, Nicholas Copernicus University T. Kaczorek, Warsaw Polytechnic J. Korbicz, Technical University of Zielona Gora L. Rutkowski, Czestochowa Polytechnic R. Tadeusiewicz, Academy of Mining and Metallurgy, Krakow Z. Waszczyszyn, Krakow Polytechnic International program committee (to be confirmed, extended) S. Cho, Pohang University of Science and Technology, Korea T. Clarkson, King's College, UK S. Draghici, Wayne State University, USA G. Forsgr?n, Stora Corporate Research, Sweden I. Grabec, University of Ljubljana, Slovenia A. Iwata, Nagoya Institute of Technology, Japan C. Kuroda, Tokyo Institute of Technology, Japan H. Liljenstr?m, Royal Institute of Technology, Sweden L. Ludwig, University of Tubingen, Germany M. Mohammadian, Monash University, Australia P. Myllykoski, Helsinki University of Technology, Finland A. Owens, DuPont, USA R. Parenti, Ansaldo Ricerche, Italy F. Sandoval, University of Malaga, Spain C. Schizas, University of Cyprus, Cyprus E. Tulunay, Middle East Technical University, Turkey S. Usui, Toyohashi University of Technology, Japan P. Zufiria, Polytechnic University of Madrid, Spain Electronic mail is not absolutely reliable, so if you have not heard from the conference secretariat after sending your abstract, please contact again. You should receive an abstract number in a couple of days after the submission. From roitblat at hawaii.edu Tue Aug 11 13:31:10 1998 From: roitblat at hawaii.edu (Herbert L. 
Roitblat) Date: Tue, 11 Aug 1998 07:31:10 -1000 Subject: Connectionist symbol processing: any progress? In-Reply-To: <23040.902820867@skinner.boltz.cs.cmu.edu> Message-ID: I don't have answers, but I might add to the bibliography. Part of the problem with symbol processing approaches is that they are not as truly successful as they might pretend. The syntactic part of symbol processing is easy and reliable, but the semantic part is still a muddle. In English, for example, the words do not really have a systematic meaning as Fodor and others would assert. I have my own ideas on the matter, but here are a couple of references on the topic. Aizawa, K. (1997) Explaining systematicity. Mind and Language, 12, 115-136. Mathews, R. J. (1997) Can connectionists explain systematicity? Mind and Language, 12, 154-177. On Mon, 10 Aug 1998 Dave_Touretzky at cs.cmu.edu wrote: > I'd like to start a debate on the current state of connectionist > symbol processing? Is it dead? Or does progress continue? > .. > > People still do cognitive modeling using connectionist networks. And > there is some good work out there. One of my favorite examples is > David Plaut's use of attractor neural networks to model deep and > surface dyslexia -- an area pioneered by Geoff Hinton and Tim > Shallice. But like most connectionist cognitive models, it relies on > a simple feature vector representation. The problems of structured > representations and variable binding have remained unsolved. No one > is trying to build distributed connectionist reasoning systems any > more, like the connectionist production system I built with Geoff > Hinton, or Mark Derthick's microKLONE. > > Today, Michael Arbib is working on the second edition of his handbook, > and I've been asked to update my article on connectionist symbol > processing. Is it time to write an obituary for a research path that > expired because the problems were too hard for the tools available? > Or are there important new developments to report? > > I'd love to hear some good news. > > -- Dave Touretzky > > > References: > > Arbib, M. A. (ed) (1995) Handbook of Brain Theory and Neural Networks. > Cambridge, MA: MIT Press. > > Plaut, D. C., McClelland, J. L., Seidenberg, M. S., and Patterson, > K. (1996) Understandig normal and impaired word reading: computational > principles in quasi-regular domains. Psychological Review, > 103(1):56-115. > > Plate, T. A. (1995) Holographic reduced representations. IEEE > Transactions on Neural Networks, 6(3):623. > > Touretzky, D. S. and Hinton, G. E. (1988) A distributed connectionist > production system. Cognitive Science, vol. 12, number 3, pp. 423-466. > > Touretzky, D. S. (1995) Connectionist and symbolic representations. > In M. A. Arbib (ed.), Handbook of Brain Theory and Neural Networks, > pp. 243-247. MIT Press. > Herbert Roitblat, Ph.D. Professor of Psychology roitblat at hawaii.edu University of Hawaii (808) 956-6727 (808) 956-4700 fax 2430 Campus Road, Honolulu, HI 96822 USA From jose at tractatus.rutgers.edu Tue Aug 11 16:09:48 1998 From: jose at tractatus.rutgers.edu (Stephen Jose Hanson) Date: Tue, 11 Aug 1998 16:09:48 -0400 Subject: POSTDOC LINE FOR FALL 98 Message-ID: <35D0A50C.36703D4@tractatus.rutgers.edu> The DEPARTMENT OF PSYCHOLOGY at RUTGERS UNIVERSITY-Newark Campus - COGNITIVE NEUROSCIENC POSTDOCTORAL Position A postdoctoral position that can be filled immediately running through Fall98/Spring99 with a possibility of a second year renewal (starting date Flexible). 
Area of specialization in connectionist modeling with applications to recurrent networks, image processing and cognitive neuroscience (functional imaging). Review of applications will begin immediately, but applications will continue to be accepted until the position is filled. Starting date is flexible in the Summer 98 time frame. Rutgers University is an equal opportunity/affirmative action employer. Qualified women and minority candidates are especially encouraged to apply. Send CV to Professor S. J. Hanson, Chair, Department of Psychology - Post Doc Search, Rutgers University, Newark, NJ 07102. Email enquiries can be made to jose at psychology.rutgers.edu; please include "POSTDOC" in the subject heading. From shastri at ICSI.Berkeley.EDU Tue Aug 11 21:15:57 1998 From: shastri at ICSI.Berkeley.EDU (Lokendra Shastri) Date: Tue, 11 Aug 1998 18:15:57 PDT Subject: Connectionist symbol processing: any progress? Message-ID: <199808120115.SAA16013@lassi.ICSI.Berkeley.EDU> Dear Connectionists, I am happy to report that work on connectionist symbol processing is not only alive and kicking, but also in robust and vigorous health. Are we a hair's breadth away from answering all the hard questions about representation and learning? No. But are we making sufficient progress to warrant optimism? Certainly yes! A debate might be fun, but it seems better to start by pointing out a few other examples of relevant work in the field. This should enable those who have not followed the developments to judge for themselves. Here are some URLs (with many references) http://www.icsi.berkeley.edu/NTL/ http://www.icsi.berkeley.edu/~shastri/shrutibiblio.html http://cs.ua.edu/~rsun http://www.dcs.ex.ac.uk/~jamie/ http://www.psych.ucla.edu/Faculty/Hummel/ and here are a few references for readers with limited web access: Bailey, D., Feldman, J., Narayanan, S., and Lakoff, G. (1997). Embodied Lexical Development. Proceedings of the Nineteenth Annual Meeting of the Cognitive Science Society (COGSCI-97), Aug 9-11, Stanford: Stanford University Press. Henderson, J. (1994). Connectionist Syntactic Parsing Using Temporal Variable Binding. Journal of Psycholinguistic Research, 23(5):353-379. Regier, T. (1996). The Human Semantic Potential: Spatial Language and Constrained Connectionism. Cambridge, MA: MIT Press. From paolo at uow.edu.au Tue Aug 11 20:58:30 1998 From: paolo at uow.edu.au (Paolo Frasconi) Date: Wed, 12 Aug 1998 10:58:30 +1000 (EST) Subject: Connectionist symbol processing: any progress? In-Reply-To: <23040.902820867@skinner.boltz.cs.cmu.edu> Message-ID: Adaptive techniques for dealing with structured information have recently emerged. In particular, algorithms and architectures for learning directed ordered acyclic graphs (DOAGs) are available and they have been shown to be effective in some application domains such as automated reasoning and pattern recognition. The basic idea is an extension of recurrent neural networks from sequences (which can be seen as a very special case of graphs having a linear-chain shape) to graphs. A generalization of backpropagation through time is available for acyclic graphs. Models for data structures can be conveniently represented in a graphical formalism which makes it simple to understand them as special cases of belief networks. There is still quite a lot of work to be done in this area and the interest is expanding. Last year we had a NIPS workshop on adaptive processing of data structures.
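To make the preceding paragraph concrete: in these models the representation of a node is computed from its label and the already-computed representations of its children, visiting the graph in reverse topological order, just as a recurrent network consumes a sequence token by token. The sketch below is only a toy rendering of that idea for the special case of binary trees; it is not the authors' architecture or code, the weights are random rather than trained, and the dimensions are made-up assumptions. Training would backpropagate through this same recursive structure.

import numpy as np

rng = np.random.default_rng(2)
LABEL_DIM, STATE_DIM = 4, 8

# Random (untrained) parameters of a simple recursive cell:
#   state(node) = tanh(W_lab @ label + W_l @ state(left) + W_r @ state(right))
W_lab = rng.normal(0, 0.5, (STATE_DIM, LABEL_DIM))
W_l = rng.normal(0, 0.5, (STATE_DIM, STATE_DIM))
W_r = rng.normal(0, 0.5, (STATE_DIM, STATE_DIM))

def encode(node):
    """Bottom-up encoding of a binary tree given as (label, left, right) or None."""
    if node is None:
        return np.zeros(STATE_DIM)           # base state for missing children
    label, left, right = node
    return np.tanh(W_lab @ label + W_l @ encode(left) + W_r @ encode(right))

# A tiny example tree with one-hot labels.  A plain sequence is the special
# case in which every right child is None (a linear chain of left children).
leaf = lambda i: (np.eye(LABEL_DIM)[i], None, None)
tree = (np.eye(LABEL_DIM)[0], leaf(1), (np.eye(LABEL_DIM)[2], leaf(3), None))
print("root representation:", np.round(encode(tree), 3))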
Further details and links to papers can be found in the web page http://www.dsi.unifi.it/~paolo/datas (also mirrored at http://www.uow.edu.au/~paolo/datas). Paolo Frasconi paolo at uow.edu.au Marco Gori marco at ing.unisi.it Alessandro Sperduti perso at di.unipi.it Paolo Frasconi Visiting Lecturer Faculty of Informatics University of Wollongong Phone: +61 2 4221 3121 (office) Northfields Avenue +61 2 4226 4925 (home) Wollongong NSW 2522 Fax: +61 2 4221 4843 AUSTRALIA http://www.uow.edu.au/~paolo From kmarg at uom.gr Wed Aug 12 04:30:35 1998 From: kmarg at uom.gr (Kostas Margaritis) Date: Wed, 12 Aug 1998 11:30:35 +0300 (EET DST) Subject: Connectionist symbol processing: any progress? In-Reply-To: <23040.902820867@skinner.boltz.cs.cmu.edu> Message-ID: Cognitive Maps (and Fuzzy Cognitive Maps) are another aspect of Graphical Belief Models which can be viewed as Recurrent Neural Networks, leading to concepts relevant to Cognitive or Decision Dynamics. The original graphical cognitive model can be manipulated either as a directed graph (i.e., cycles, centrality, etc.) or as a dynamical system leading to behaviours similar to the dynamics of recurrent neural networks. Kostas Margaritis Dept. of Informatics Univ. of Macedonia Thessaloniki, Greece e-mail kmarg at uom.gr From r.gayler at psych.unimelb.edu.au Wed Aug 12 08:19:20 1998 From: r.gayler at psych.unimelb.edu.au (Ross Gayler) Date: Wed, 12 Aug 1998 22:19:20 +1000 Subject: Connectionist symbol processing: any progress? Message-ID: <3.0.32.19980812221919.006a0be0@myriad.unimelb.edu.au> At 03:34 11/08/98 -0400, you (Dave_Touretzky at cs.cmu.edu) wrote: >I'd like to start a debate on the current state of connectionist >symbol processing? Is it dead? Or does progress continue? > ... >So I concluded that connectionist symbol processing had reached a >plateau, and further progress would have to await some revolutionary >new insight about representations. The last really significant work >in the area was, in my opinion, Tony Plate's holographic reduced >representations, which offered a glimpse of how structured information >might be plausibly manipulated in distributed form. > ... >The problems of structured >representations and variable binding have remained unsolved. No one >is trying to build distributed connectionist reasoning systems any >more, like the connectionist production system I built with Geoff >Hinton, or Mark Derthick's microKLONE. Tony Plate is still alive and kicking. There was a small group of like-minded researchers present at the Analogy'98 workshop in Sofia. They would (and did) argue that methods related to Tony's HRRs do solve the problems of structured representations and variable binding. (The real problem is not how to represent structures but how to use those structures to drive cognitively useful operations.) The fact that these people turned up at an analogy workshop follows from a belief that analogy is the major mode of (connectionist) reasoning.
The Sofia papers most related to HRRs are: Tony Plate Structured operations with distributed vector operations available from Tony's home page: http://www.mcs.vuw.ac.nz/~tap/ Pentti Kanerva Dual role of analogy in the design of a cognitive computer related older papers are available from: http://www.sics.se/nnrc/spc.html Ross Gayler & Roger Wales (note blatant self-promotion) Connections, binding, unification and analogical promiscuity Multiplicative binding, representation operators, and analogy both available from the computer science section of: http://cogprints.soton.ac.uk/ Other (non-HRR) connectionist analogy work from Sofia included: Keith Holyoak & John Hummel Analogy in a physical symbol system Graeme Halford , William Wilson & Steven Phillips Relational processing in higher cognition Julie McCredden NetAB: A neural network model of analogy by discovery Also, Chris Eliasmith (who was not at the Sofia workshop) has done some work re-implementing the ACME model of analogical mapping to be based on HRR's Check out his unpublications at: http://ascc.artsci.wustl.edu/~celiasmi/ Finally, anyone who wants to know about the Sofia workshop should check: http://www-psych.stanford.edu/~kalina/cogsci98/analogy_workshop.html and copies of the proceedings may be ordered from: analogy at cogs.nbu.acad.bg Cheers, Ross Gayler From marwan at ee.usyd.edu.au Wed Aug 12 08:46:09 1998 From: marwan at ee.usyd.edu.au (Marwan Jabri) Date: Wed, 12 Aug 1998 22:46:09 +1000 (EST) Subject: academic positions at the University of Sydney Message-ID: Lecturer/Senior Lecturer in Computer Engineering (2 positions) School of Electrical & Information Engineering The University of Sydney, Australia Reference No. A30/01 The School of Electrical & Information Engineering is expanding its teaching and research programs in the area of computer engineering, and invites applications for two continuing track positions in that area. Particular areas of interest include: neuromorphic engineering; VSLI systems; VLSI based architectures and applications; embedded hardware/software co-design; computer hardware and architectures; advanced digital engineering; and parallel and distributed processing. The existing academic staff in related areas have research interests in low power VLSI, neuromorphic engineering, artificial intelligence, real-time systems, biomedical systems, non linear and adaptive control, communications systems and image processing. Applicants at the Senior Lecturer level should have a PhD in Electrical or Computer Engineering. Applicants at the Lecturer level should have or should be expecting soon a PhD. PhD's in computer science with a substantial engineering background may be considered. The position requires commitment to leadership in research and teaching. Candidates must have a high level of research outcomes and would have the capability to teach undergraduate classes and supervise research students. Undergraduate teaching experience and experience of industrial applications are desirable. Membership of a University approved superannuation scheme is a condition of employment for new appointees. For further information contact Professor M A Jabri, on (+61-2) 9351 2240, fax: (+61-2) 9351 7209, email: , or the Head of Department, Professor D J Hill, on (+61-2) 9351 4647, fax: (+61-2) 9351 3847, email: . Salary: Senior Lecturer $57,610 - $66,429 pa. (increasing to $59,338 - $68,422 pa. on 1/10/98) Lecturer $47,029 - $55,848 pa. (increasing to $48,440 - $57,523 pa. 
on 1/10/98) (Level of appointment and responsibility will be commensurate with qualifications and experience) Academic staff at the School of Electrical and Information Engineering are currently entitled to receive a market loading (up to a maximum of 33.33% of base salary). The market loading scheme is expected to continue until the end of 1999 when it will be reviewed. Closing: 29 October 1998 Application Information ----------------------- No smoking in the workplace is University policy. Equal employment opportunity is University policy. Other than in exceptional circumstances all vacancies within the University are advertised in the Bulletin Board and on the World Wide Web. Intending applicants are encouraged to seek further information from the contact given before submitting a formal application. Application Method ------------------ Four copies of the application, quoting reference no., including curriculum vitae, list of publications and the names, addresses and fax numbers of five referees. Applications should be forwarded to: The Personnel Officer (College of Sciences and Technology), Carslaw Building, (F07) The University of Sydney NSW 2006 Australia The University reserves the right not to proceed with any appointment for financial or other reasons. From ruderman at salk.edu Wed Aug 12 13:12:00 1998 From: ruderman at salk.edu (Dan Ruderman) Date: Wed, 12 Aug 1998 10:12:00 -0700 (PDT) Subject: Preprint available on ICA of time-varying natural images Message-ID: The following preprint is available via the web: Independent component analysis of image sequences yields spatiotemporal filters similar to simple cells in primary visual cortex J.H. van Hateren and D.L. Ruderman Proc. R. Soc. Lond. B, in press Abstract Simple cells in primary visual cortex process incoming visual information with receptive fields localized in space and time, bandpass in spatial and temporal frequency, tuned in orientation, and commonly selective for the direction of movement. It is shown that performing independent component analysis on video sequences of natural scenes produces results with qualitatively similar spatiotemporal properties. Whereas the independent components of video resemble moving edges or bars, the independent component filters, i.e. the analogues of receptive fields, resemble moving sinusoids windowed by steady gaussian envelopes. Contrary to earlier ICA results on static images, which gave only filters at the finest possible spatial scale, the spatiotemporal analysis yields filters at a range of spatial and temporal scales. Filters centered at low spatial frequencies are generally tuned to faster movement than those at high spatial frequencies. http://hlab.phys.rug.nl/papers/pvideow.html From tho at james.hut.fi Wed Aug 12 14:23:47 1998 From: tho at james.hut.fi (Timo Honkela) Date: Wed, 12 Aug 1998 21:23:47 +0300 (EEST) Subject: Connectionist symbol processing: any progress? In-Reply-To: Message-ID: On Tue, 11 Aug 1998, Herbert L. Roitblat wrote: > I don't have answers, but I might add to the bibliography. Part of the > problem with symbol processing approaches is that they are not as truly > successful as they might pretend. The syntactic part of symbol processing > is easy and reliable, but the semantic part is still a muddle. In > English, for example, the words do not really have a systematic meaning as > Fodor and others would assert. (...) This point of view is most welcome: the view of symbol processing is often rather limited, e.g., to systematicity and structural issues. 
The area of natural language semantics and pragmatics is vast, and in my opinion, especially the unsupervised connectionist learning paradigm, e.g., the self-organizing map, provides a basis for modeling and understanding phenomena such as subjectivity of interpretation, intersubjectivity, meaning in context, negotiating meaning, emergence of categories not explicitly given a priori, adaptive prototypes, etc. (please see Timo Honkela: Self-Organizing Maps in Natural Language Processing, Helsinki University of Technology, PhD thesis, 1997; http://www.cis.hut.fi/~tho/thesis/). Adding to the list of references I would like to mention, e.g., MacWhinney, B. (1997). Cognitive approaches to language learning, chapter Lexical Connectionism. MIT Press. (see http://psyscope.psy.cmu.edu/Local/Brian/) Miikkulainen, R. (1997). Self-organizing feature map model of the lexicon. Brain and Language, 59:334-366. (see http://www.cs.utexas.edu/users/risto/) Also Peter Gärdenfors has written several relevant articles in this area (see http://lucs.fil.lu.se/Staff/Peter.Gardenfors/). Best regards, Timo Honkela ------------------------------------------------------------------- Timo Honkela, PhD Timo.Honkela at hut.fi http://www.cis.hut.fi/~tho/ Neural Networks Research Centre, Helsinki University of Technology P.O.Box 2200, FIN-02015 HUT, Finland Tel. +358-9-451 3275, Fax +358-9-451 3277 From jfeldman at ICSI.Berkeley.EDU Wed Aug 12 14:33:46 1998 From: jfeldman at ICSI.Berkeley.EDU (Jerry Feldman) Date: Wed, 12 Aug 1998 11:33:46 -0700 Subject: Connectionist symbol processing: any progress? Message-ID: <35D1E00A.7B1CC6A2@icsi.berkeley.edu> Dave Touretzky asks how well we are doing at making neurally plausible models of human symbolic processes like natural language. Let's start at CMU. One of the driving sources of the "new connectionism" was the interactive activation model of McClelland and Rumelhart; Jay McC continues to work on this as well as other things. John Anderson's spreading activation ACT* models continue to attract (annual?) workshops. More broadly, the various basic ideas of connectionist modeling are playing an important (sometimes dominant) role in several fields that deal with language and symbolic behavior. For example, Elman nets continue to be a standard way to do models in Cognitive Psychology. The text and workbook on "Rethinking Innateness" by Elman et al. is a major force in Developmental Psychology. Spreading activation models underlie all priming work in Cognitive Psychology and Psycholinguistics. In Neuropsychology, Damasio's convergence zones continue to attract serious attention. Paul Smolensky's Harmony Theory (in a simplified form) has become a dominant paradigm in phonology and is beginning to play a large role in discussions of grammar - witness the long invited article in Science this year. Shastri's note lists a number of other relevant efforts. In Artificial Intelligence, Belief Networks have arguably replaced Symbolic Logic as the leading paradigm. The exact relation between Belief Networks and structured connectionist models remains to be worked out and this would be a good topic for discussion on this list. For a good recent example, see the (prize) paper by Srini Narayanan and Dan Jurafsky at CogSci98. It is true that none of this is much like Touretzky's early attempt at a holographic LISP and that there has been essentially no work along these lines for a decade. There are first order computational reasons for this.
These can be (and have been) spelled out technically but the basic idea is straightforward - PDP (Parallel Distributed Processing) is a contradiction in terms. To the extent that representing a concept involves all of the units in a system, only one concept can be active at a time. Dave Rumelhart says this is stated somewhere in the original PDP books, but I forget where. The same basic point accounts for the demise of the physicists' attempts to model human memory as a spin glass. Distributed representations do occur in the brain and are useful in many tasks, conceptual representation just isn't one of them. The question of how the connectionist brain efficiently realizes (and learns) symbolic processes like language is one of the great intellectual problems of our time. I hope that people on this list will continue to contribute to its solution. -- Jerry Feldman From gutkin at cnbc.cmu.edu Wed Aug 12 16:58:44 1998 From: gutkin at cnbc.cmu.edu (Boris Gutkin) Date: Wed, 12 Aug 1998 16:58:44 -0400 (EDT) Subject: Paper on ISI variability Message-ID: Dear Colleagues, we wanted to bring to your attention our recently published paper on spike generating dynamics and ISI variability in cortical neurons, which is available in postscript from: cnbc.cmu.edu in pub/user/gutkin the file is type1noise.ps Cheers, Boris Gutkin _______________________________ Neural Computation 10(5) Dynamics of membrane excitability determine inter-spike interval variability: a link between spike generation mechanisms and cortical spike train statistics. Boris S. Gutkin and G. Bard Ermentrout Program in Neurobiology and Dept. of Mathematics University of Pittsburgh Pittsburgh PA We propose a biophysical mechanism for the high interspike interval variability observed in cortical spike trains. The key lies in the non-linear dynamics of cortical spike generation which are consistent with type I membranes where saddle-node dynamics underlie excitability [Rinzel '89]. We present a canonical model for type I membranes, the $\theta$-neuron. The $\theta$-neuron is a phase model whose dynamics reflect salient features of type I membranes. This model generates highly variable spike trains (coefficient of variation (cv) above 0.6) when brought to firing by noisy inputs. This happens because the timing of spikes for a type I excitable cell is exquisitely sensitive to the amplitude of the supra-threshold stimulus pulses. A noisy input current, giving random amplitude "kicks" to the cell, evokes highly irregular firing across a wide range of firing rates. On the other hand, an intrinsically oscillating cell gives regular spike trains. We corroborate the above results with simulations of the Morris-Lecar (M-L) neural model with random synaptic inputs. Type I M-L yields high cv's. When this model is modified to have type II dynamics (periodicity arises via a Hopf bifurcation), it gives regular spike trains (cv below 0.3). Our results suggest that the high cv values such as those observed in cortical spike trains are an intrinsic characteristic of type I membranes driven to firing by "random" inputs. In contrast, neural oscillators or neurons exhibiting type II excitability should produce regular spike trains. From brianh at shn.net Wed Aug 12 20:31:22 1998 From: brianh at shn.net (Brian Hazlehurst) Date: Wed, 12 Aug 1998 17:31:22 -0700 Subject: Connectionist symbol processing: any progress?
In-Reply-To: <35D1E00A.7B1CC6A2@icsi.berkeley.edu> (message from Jerry Feldman on Wed, 12 Aug 1998 11:33:46 -0700) Message-ID: Dave Touretsky's question about the state of research in connectionist symbol processing has elicited interesting responses showing that some interpret the problem as computational, some see the problem as biological, while others see the problem as more narrowly neurological. To this mix, let me add a view that the problem of symbols is (also) social and historical. If this idea interests you, I invite you to read a recent publication: The Emergence of Propositions from the Co-ordination of Talk and Action in a Shared World Brian Hazlehurst brianh at shn.net Edwin Hutchins hutchins at cogsci.ucsd.edu Abstract: We present a connectionist model that demonstrates how propositional structure can emerge from the interactions among members of a community of simple cognitive agents. We first describe a process in which agents coordinating their actions and verbal productions with each other in a shared world leads to the development of propositional structures. We then present a simulation model which implements the process for generating propositions from scratch. We report and discuss the behavior of the model in terms of its ability to produce three properties of propositions: (1) a coherent lexicon characterized by shared form-meaning mappings; (2) conventional structure in the sequence of forms; (3) the predication of spatial facts. We show that these properties do not emerge when a single individual learns the task alone and conclude that the properties emerge from the demands of the communication task rather than from anything inside the individual agents. We then show that the shared structural principles can be described as a grammar, and discuss the implications of this demonstration for theories concerning the origins of the structure of language. In: K. Plunket (Ed.), Language Acquisition and Connectionism. Special Issue of *Language and Cognitive Processes*, 1998, 13 (2/3), 373-424. ///////////////////////////////////////////////////// Brian Hazlehurst Chief Scientist Sapient Health Network brianh at shn.net www.shn.net ///////////////////////////////////////////////////// From goldfarb at unb.ca Wed Aug 12 20:52:09 1998 From: goldfarb at unb.ca (Lev Goldfarb) Date: Wed, 12 Aug 1998 21:52:09 -0300 (ADT) Subject: Connectionist symbol processing: any progress? In-Reply-To: <23040.902820867@skinner.boltz.cs.cmu.edu> Message-ID: On Tue, 11 Aug 1998 Dave_Touretzky at cs.cmu.edu wrote: > I'd like to start a debate on the current state of connectionist > symbol processing? Is it dead? Or does progress continue? > ................................................................. > So I concluded that connectionist symbol processing had reached a > plateau, and further progress would have to await some revolutionary > new insight about representations. > .................................................................... > Today, Michael Arbib is working on the second edition of his handbook, > and I've been asked to update my article on connectionist symbol > processing. Is it time to write an obituary for a research path that > expired because the problems were too hard for the tools available? > Or are there important new developments to report? > > I'd love to hear some good news. David, I'm afraid, I haven't got the "good news", but, who knows, some good may still come out of it. 
About 8-9 years ago, soon after the birth of the connectionists mailing list, there was a discussion somewhat related to the present one. I recall stating, in essence, that it doesn't make sense to talk about the connectionist symbol processing simply because the connectionist representation space--the vector space over the reals--by its very definition (recall the several axioms that define it) doesn't allow one to "see" practically any symbolic operations, and therefore one cannot construct, or learn, in it (without cheating) the corresponding inductive class representation. I have been reluctant to put a substantial effort into a formal proof of this statement since I believe (after so many years of working with the symbolic data) that it is, in some sense, quite obvious (see also [1-3]). Let me try, again, to clarify the above. Hacking apart, the INPUT SPACE of a learning machine must be defined axiomatically, as is the now universal practice in mathematics. These axioms define the BASIC OPERATIONAL BIAS of the learning machine, i.e. the bias related to the class of permitted object operations (compare with the central CS concept of abstract data type). There could be, of course, other, additional, biases related to different classes of learning algorithms each operating, however, in the SAME input space (compare, for example, with the Chomsky overall framework for languages and its various subclasses of languages). It appears that the present predicament is directly related to the fact that, historically, in mathematics, there was, essentially, no work done on the formalization of the concept of "symbolic" representation space. Apparently, such spaces are nontrivial generalizations of the classical representation spaces, the latter being used in all sciences and have evolved from the "numeric" spaces. I emphasize "in mathematics" since logic (including computability theory) does not deal with the representation spaces, where the "representation space" could be thought of as a generalization of the concept of MEASUREMENT SPACE. By the way, "measurement" implies the presence of some distance measure(s) defined on the corresponding space, and that is the reason why the study of such spaces belongs to the domain of mathematics rather than logic. It appears to us now that there are fundamental difference between the two classes of "measurement spaces": the "symbolic" and the "numeric" spaces (see my home page). To give you at least some idea about the differences, I am presenting below the "symbolic solution" (without the learning algorithm) to the generalized parity problem, the problem quite notorious within the connectionist community. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ THE PARITY CLASS PROBLEM The alphabet: A = {a, b} ------------ Input set S (i.e. the input space without the distance function): The set ----------- of strings over A. The parity class C: The set of strings with an even number of b's. ------------------ Example of a positive training set C+: aababbbaabbaa ------------------------------------- baabaaaababa abbaaaaaaaaaaaaaaa bbabbbbaaaaabab aaa Solution to the parity problem, i.e. inductive (parity) class representation: ----------------------------------------------------------------------------- One element from C+, e.g. 
'aaa', plus the following 3 weighted operations operations (note that the sum of the weights is 1) deletion/insertion of 'a' (weight 0) deletion/insertion of 'b' (weight 1) deletion/insertion of 'bb' (weight 0) This means, in particular, that the DISTANCE FUNCTION D between any two strings from the input set S is now defined as the shortest weighted path (based on the above set of operations) between these strings. The class is now defined as the set of all strings in the measurement space (S,D) whose distance from aaa is 0. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Why do, then, so many people work on the "connectionist symbol processing"? On the one hand, many of us feel (correctly, in my opinion) that the symbolic representation is a very important topic. On the other hand, and I am quite sure of that, if we look CAREFULLY at any corresponding concrete implementation, we would see that in order to "learn" the chosen symbolic class one had to smuggle into the model, in some form, some additional structure "equivalent" to the sought symbolic structure (e.g. in the form of the recurrent ANN's architecture). This is, again, due to the fact that in the vector space one simply cannot detect (in a formally reasonable manner) any non-vector-space operations. [1] L. Goldfarb, J. Abela, V.C. Bhavsar, V.N. Kamat, Can a vector space based learning model discover inductive class generalization in a symbolic environment? Pattern Recognition Letters, 16 (7), 1995, pp. 719-726. [2] L. Goldfarb and J. Hook, Why classical models for pattern recognition are not pattern recognition models, to appear in Proc. Intern. Conf. on Advances in Pattern Recognition (ICAPR), ed. Sameer Singh, Plymouth, UK, 23-25 Nov. 1998, Springer. [3] V.C. Bhavsar, A.A. Ghorbany, L. Goldfarb, Artificial neural networks are not learning machines, Tech. Report, Faculty of Computer Science, U.N.B. --Lev Goldfarb http://wwwos2.cs.unb.ca/profs/goldfarb/goldfarb.htm From Lakhmi.Jain at unisa.edu.au Wed Aug 12 21:55:23 1998 From: Lakhmi.Jain at unisa.edu.au (Lakhmi Jain) Date: Thu, 13 Aug 1998 11:25:23 +0930 Subject: Please consider to circulate widely Message-ID: <5DA42F8A3C18D111979400AA00DD609D012F62A7@EXSTAFF3.Levels.UniSA.Edu.Au> Please consider to circulate widely INTERNATIONAL SERIES ON COMPUTATIONAL INTELLIGENCE CRC Press, USA Series Editor-in-chief: L.C. Jain Proposals from authors and editors for future volumes are welcome. Knowledge-Based computational intelligence techniques involve the use of computers to enable machines to simulate human performance. The prominent paradigms used include expert systems, artificial neural networks, fuzzy logic, evolutionary computing techniques and chaos engineering. These knowledge-based computational intelligence techniques have generated tremendous interest among scientists and application engineers due to a number of benefits such as generalization, adaptation, fault tolerance and self-repair, self-organization and evolution. Successful demonstration of the applications of knowledge-based systems theories will aid scientists and engineers in finding sophisticated and low cost solutions to difficult problems. This new series is unique in that it presents novel designs and applications of knowledge-based computational intelligence techniques in engineering, science and related fields. L.C. 
Jain, BE(Hons), ME, PhD, Fellow IE(Aust) Founding Director Knowledge-Based Intelligent Engineering Systems Centre University of South Australia Adelaide Mawson Lakes, SA 5095 Australia L.Jain at unisa.edu.au Please send your proposal including Title of the book General description: what need does this book fill? List 5 key features of the book Table of contents Preface Primary and secondary markets: Who will buy it? Competing and/or related books and publishers: What are the advantages of your book over the competition? A brief CV or biography, with relevant publications Expected manuscript delivery date From makowski at neovista.com Thu Aug 13 18:00:05 1998 From: makowski at neovista.com (Greg Makowski) Date: Thu, 13 Aug 1998 15:00:05 -0700 Subject: 4+ immediate job openings for Knowledge Discovery Engineers Message-ID: <35D361E5.32CC4C8B@neovista.com> 4+ immediate job openings Knowledge Discovery Engineer (KDE) for NeoVista Software Job Description In this position, candidates will be responsible for the implementation and technical success of business solutions through the application of NeoVista's Decision Series or vertical software products. NeoVista puts the business value of customer solutions first. Candidates who will be strongly preferred are those with business deployment experience in at least one, and preferably more, of the following analytic techniques: neural networks, decision trees (e.g. C5.0, CART or CHAID), naive Bayes, clustering or association rules. Strong experience solving business problems with modeling processes, regression, statistics or other analytic numeric methods is also relevant for this position. KDE's should be proficient in programming and manipulating large data systems as part of the analysis. Analytic projects may involve 10-50 GB of data, and may be contained in a half dozen tables to as many as 75 tables. Strong experience in either SQL or SAS is a core skill set. Experience in Unix scripting, Java, C++, or C can be helpful. Must have a BA/BS or equivalent with 5+ years of experience in the computer industry, although most of our KDE's have an MS or Ph.D. One or more years of experience with large scale data mining/knowledge discovery/pattern recognition projects as an implementor is desirable. Helpful experience includes analytic or business experience in our vertical markets. Our vertical markets include retail (supply chain optimization problems), insurance, banking or telco (Customer Relationship Marketing problems such as customer retention, segmentation or fraud). Experience in targeted marketing analytics or deployment is useful. Candidates are not required to have such vertical market experience. Other helpful experience includes: presentation, written and client communication skills, project management or proposal writing. Periodic travel may be required for project presentations, or for on site analytic work. This job description describes a "center" of the set of skills useful for this position. It may be to candidates' advantage to send a cover letter detailing specific experiences, strengths and weaknesses relevant to the KDE role. Preferred locations for employment are: Cupertino, CA, Dallas, Atlanta or other major metropolitan areas. NeoVista Software has an immediate need to hire multiple KDE's. Pre-Sales Knowledge Discovery Engineer (KDE) for NeoVista Software Job Description People in this position work with a sales executive to listen to prospects, understand their business problems, and to propose a compelling KDD solution.
Giving presentations and or demos to small groups of business VP's and their technical advisors is a common activity. It is important to be able to communicate effectively to business executives and individuals that may be technical in either IT or analytic methods. Travel for 1-3 day trips twice a month may not be uncommon. Once the pre-sales KDE proposes a solution, they may write for a proposal the project plan, description of major tasks, project deliverables and possible business deployment strategies. They may also play an account management role with regard to technical issues. The pre-sales KDE should be able to meet the majority of the requirements for a KDE, however the distinction between KDE's and pre-sales KDE's occurs in degrees. Preferred locations for employment are: Cupertino, CA, Dallas, Atlanta or other major metropolitan areas. NeoVista Software has immediate openings for multiple pre-sales KDE's. Application Directions: Please email or fax a resume and a cover letter addressed to me. I will be travelling the week of 8/17-21/98, so during this time please send both an email and fax. Greg Makowski (pre-sales KDE) NeoVista Software, Inc. 10710 North Tantau Avenue Cupertino, CA 95014 makowski at neovista.com (Word or text resume) (408) 777-2930 fax (408) 343-4239 phone www.neovista.com Thank you for your time, Greg -------------- next part -------------- A non-text attachment was scrubbed... Name: vcard.vcf Type: text/x-vcard Size: 427 bytes Desc: Card for Greg Makowski Url : https://mailman.srv.cs.cmu.edu/mailman/private/connectionists/attachments/00000000/c87d5416/vcard-0001.vcf From rickert at cs.niu.edu Thu Aug 13 13:56:18 1998 From: rickert at cs.niu.edu (Neil W Rickert) Date: Thu, 13 Aug 1998 14:56:18 -0300 Subject: Connectionist symbol processing: any progress? Message-ID: <12011.903026162@ux.cs.niu.Edu> On Wed, 12 Aug 1998 Lev Goldfarb wrote: > Hacking apart, the INPUT SPACE of >a learning machine must be defined axiomatically, as is the now universal >practice in mathematics. These axioms define the BASIC OPERATIONAL BIAS of >the learning machine, i.e. the bias related to the class of permitted >object operations (compare with the central CS concept of abstract data >type). Why? This seems completely wrong to me. It seems to me that what you have stated can be paraphrased as: The knowledge that the learning machine is to acquire must be pre-encoded into the machine as (implicit or explicit) innate structure. But, if my paraphrase is correct, then one wonders why such a machine warrants the name "learning machine." Presumably we have very different ideas as to what is learning. I take it that human learning is the process of discovering the nature of our world. More generally I take it that, at its most fundamental level, learning is the process of discovering the structure of the input space, a process which may require the eventual production of an axiomatization (a set of natural laws) for that space. From goldfarb at unb.ca Thu Aug 13 15:09:27 1998 From: goldfarb at unb.ca (Lev Goldfarb) Date: Thu, 13 Aug 1998 16:09:27 -0300 (ADT) Subject: Connectionist symbol processing: any progress? In-Reply-To: <12011.903026162@ux.cs.niu.Edu> Message-ID: On Thu, 13 Aug 1998, Neil W Rickert wrote: > On Wed, 12 Aug 1998 Lev Goldfarb wrote: > > > Hacking apart, the INPUT SPACE of > >a learning machine must be defined axiomatically, as is the now universal > >practice in mathematics. These axioms define the BASIC OPERATIONAL BIAS of > >the learning machine, i.e. 
the bias related to the class of permitted > >object operations (compare with the central CS concept of abstract data > >type). > > Why? This seems completely wrong to me. > > It seems to me that what you have stated can be paraphrased as: > > The knowledge that the learning machine is to acquire must be > pre-encoded into the machine as (implicit or explicit) innate > structure. Neil, Your paraphrase is wrong: my statement refers to the fact that the fundamental bias related to the (evolutionary) structure of the environment must be postulated. [Again, compare with the concept of abstract data type (ADT), which has been found to be ABSOLUTELY indispensable in computer science for dealing with various type of data]. > But, if my paraphrase is correct, then one wonders why such a machine > warrants the name "learning machine." > > Presumably we have very different ideas as to what is learning. I > take it that human learning is the process of discovering the nature > of our world. More generally I take it that, at its most fundamental > level, learning is the process of discovering the structure of the > input space, a process which may require the eventual production of > an axiomatization (a set of natural laws) for that space. I am suggesting "that, at its most fundamental level, learning is the process of discovering" the inductive class structure and not "the structure of the input space". The latter would be simply impossible to discover unless one is equipped with exactly the same bias. If the machine is not equipped with the "right" evolutionary bias (about the COMPOSITIONAL STRUCTURE OF OBJECTS in the universe) hardly any reliable inductive learning is possible. On the other hand, if it is equipped with the right bias (about the overall structure of objects) then the inductive learning could proceed in the biologically familiar manner. I believe that the "tragedy" of the connectionist symbol processing is directly related to the fact that the vector space bias is structurally too simple/restrictive and has hardly anything to do with the symbolic bias, the latter being a simple example of the "right" structural evolutionary bias of the universe. Cheers, Lev From ruppin at math.tau.ac.il Thu Aug 13 15:10:49 1998 From: ruppin at math.tau.ac.il (Eytan Ruppin) Date: Thu, 13 Aug 1998 22:10:49 +0300 (GMT+0300) Subject: Connectionist symbol processing: any progress? Message-ID: <199808131910.WAA02170@gemini.math.tau.ac.il> Jerry Feldman writes: > - PDP (Parallel Distributed Processing) is a contradiction in terms. > To the extent that representing a concept involves all of > the units in a system, only one concept can be active at > a time. Dave Rumelhart says this is stated somewhere in > the original PDP books, but I forget where. The same > basic point accounts for the demise of the physicists' > attempts to model human memory as a spin glass. There is really no ``contradiction in terms'' here. Indeed, associative memory networks (or attractor neural networks) can activate only one stored concept at a time. However, such networks should not be viewed as representing the whole brain but should be (and indeed are) viewed as representing modular cortical structures such as columns. Given this interpretation, the problem is resolved; If these networks are sufficiently loosely coupled then many patterns can be activated together, resulting in complex and rich dynamics. We should be careful before discarding our models on false grounds. 
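A minimal sketch of the kind of attractor network Ruppin describes may help fix ideas (hypothetical code, not from his post): Hebbian outer-product weights store a few patterns, the retrieval dynamics settle onto one of them at a time, and two copies of such a module, cued differently, jointly hold two concepts at once.

import numpy as np

rng = np.random.default_rng(0)
N, P = 100, 3                              # units per module, stored patterns
patterns = rng.choice([-1, 1], size=(P, N))

# Hebbian outer-product weights for one module (zero diagonal)
W = (patterns.T @ patterns).astype(float) / N
np.fill_diagonal(W, 0.0)

def corrupt(p, flips=15):                  # noisy retrieval cue
    q = p.copy()
    q[rng.choice(N, size=flips, replace=False)] *= -1
    return q

def retrieve(cue, steps=20):               # iterate s <- sign(W s)
    s = cue.copy()
    for _ in range(steps):
        s = np.sign(W @ s)
        s[s == 0] = 1
    return s

# Two identical copies of the module, cued with different corrupted patterns:
s1 = retrieve(corrupt(patterns[0]))
s2 = retrieve(corrupt(patterns[1]))
print("copy 1 overlaps:", (patterns @ s1) / N)   # ~ (1, 0, 0): pattern 0 only
print("copy 2 overlaps:", (patterns @ s2) / N)   # ~ (0, 1, 0): pattern 1 only

Each module ends up in a single stored attractor, which is Feldman's point; several such modules running side by side carry several concepts at once, which is Ruppin's.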
We have too few viable models that can serve as paradigms of information processing. Best wishes, Eytan Ruppin. From mitsu at ministryofthought.com Thu Aug 13 16:11:14 1998 From: mitsu at ministryofthought.com (Mitsu Hadeishi) Date: Thu, 13 Aug 1998 13:11:14 -0700 Subject: Connectionist symbol processing: any progress? References: Message-ID: <35D34861.CCE67E25@ministryofthought.com> I'm not quite sure I agree with your analysis. Since I haven't looked at it in great detail, I present this as a tentative critique of your presentation. Since recurrent ANNs can be made to carry out any Turing operation (i.e., modulo their finite size, they are Turing equivalent), then unless you are saying that your represention cannot be implemented on a Turing machine, then it is clearly NOT the case that recurrent ANNs *cannot* learn arbitrary symbolic representations. Not having looked at your scheme in detail, of course, I don't know whether your scheme somehow is unimplementable even on a Turing machine, but it seems to me you must not be claiming this. You seem to be basing your argument on the notion that the input space to a recurrent ANN is a set of numbers, which you interpret as the coordinates of a vector. However, this is only a kind of vague analogy, since the field operations of the vector space (addition, multiplication, etc.) have no clear meaning on the input space. "Adding" two input vectors does not necessarily result in anything meaningful except in the sense that the recurrent ANN to be useful must be locally stable with respect to small variations in the input. However, the actual structure or metric of the input space is in some sense determined not a priori but by the state of the recurrent ANN itself, and can change over time both as a result of training and as a result of iteration. The input space is numbers, yes, but that doesn't make it a vector space. For example, what properties of the input would be preserved if I, say, added the vector (10^25, 10^25, ..) to the input? If it is a "vector space" then that operation would yield something sensible, some symmetries, and yet it obviously does not. Thus, while I sympathize with your claim that the vector field of R(n) does not admit to the structure necessary to make visible much symbolic structure, this in itself does not doom connectionist symbol processing by any means. Your argument does have weight when applied to a single-layer perceptron, which is, after all, just a thresholded/distorted linear transformation. Although it seemed to take the early connectionist community by surprise, it should be no surprise at all that a single-layer perceptron cannot learn the parity problem, because obviously the parity problem is not linearly separable, and how could any linear discriminator possibly learn a non-linearly-separable problem? However, we do not live in a world of single-layer perceptrons. Because networks are more complex than this, arguments about the linearity of the input space seem to me rather irrelevant. I suspect you mean something else, however. I think the intuitive point you are perhaps trying to make is that symbolic representations are arbitrarily nestable (recursively recombinable), and an input space which consists of a fixed number of dimensions cannot handle recursive combinations. However, one can use time-sequence to get around this problem (as we all are doing when we read and write for example). 
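A hand-wired toy makes this time-sequencing idea concrete (hypothetical code; the weights are set by hand, so it illustrates only the representational point, not the learning question being debated): a recurrent circuit with one state bit and two hidden threshold units reads a string one symbol at a time and ends up holding the parity of the 'b's, exactly the class in Goldfarb's example.

def step(z):                                # threshold unit
    return 1 if z > 0 else 0

def parity_net(string):
    h = 0                                   # recurrent state: parity of b's so far
    for ch in string:
        x = 1 if ch == 'b' else 0
        u_or  = step(h + x - 0.5)           # h OR x
        u_and = step(h + x - 1.5)           # h AND x
        h = step(u_or - u_and - 0.5)        # (h OR x) AND NOT (h AND x) = XOR(h, x)
    return h                                # 0 = even number of b's, 1 = odd

for s in ["aaa", "aababbbaabbaa", "baabaaaababa", "ab"]:
    print(s, "even" if parity_net(s) == 0 else "odd")

Whether such weights can be discovered from a small training set, rather than built in, is of course the question at issue in this exchange.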
Rather than make our eyes, for example, capable of handling arbitrarily recombinable input all at once, we sequence the input to our eyes by reading material over time. The same trick can be used with recurrent networks for example. Mitsu Lev Goldfarb wrote: > David, > I'm afraid, I haven't got the "good news", but, who knows, some good may > still come out of it. > > About 8-9 years ago, soon after the birth of the connectionists mailing > list, there was a discussion somewhat related to the present one. I recall > stating, in essence, that it doesn't make sense to talk about the > connectionist symbol processing simply because the connectionist > representation space--the vector space over the reals--by its very > definition (recall the several axioms that define it) doesn't allow one to > "see" practically any symbolic operations, and therefore one cannot > construct, or learn, in it (without cheating) the corresponding inductive > class representation. I have been reluctant to put a substantial effort > into a formal proof of this statement since I believe (after so many years > of working with the symbolic data) that it is, in some sense, quite > obvious (see also [1-3]). > > Let me try, again, to clarify the above. Hacking apart, the INPUT SPACE of > a learning machine must be defined axiomatically, as is the now universal > practice in mathematics. These axioms define the BASIC OPERATIONAL BIAS of > the learning machine, i.e. the bias related to the class of permitted > object operations (compare with the central CS concept of abstract data > type). There could be, of course, other, additional, biases related to > different classes of learning algorithms each operating, however, in the > SAME input space (compare, for example, with the Chomsky overall framework > for languages and its various subclasses of languages). > > It appears that the present predicament is directly related to the fact > that, historically, in mathematics, there was, essentially, no work done > on the formalization of the concept of "symbolic" representation space. > Apparently, such spaces are nontrivial generalizations of the classical > representation spaces, the latter being used in all sciences and have > evolved from the "numeric" spaces. I emphasize "in mathematics" since > logic (including computability theory) does not deal with the > representation spaces, where the "representation space" could be thought > of as a generalization of the concept of MEASUREMENT SPACE. By the way, > "measurement" implies the presence of some distance measure(s) defined on > the corresponding space, and that is the reason why the study of such > spaces belongs to the domain of mathematics rather than logic. > > It appears to us now that there are fundamental difference between the two > classes of "measurement spaces": the "symbolic" and the "numeric" spaces > (see my home page). To give you at least some idea about the differences, > I am presenting below the "symbolic solution" (without the learning > algorithm) to the generalized parity problem, the problem quite notorious > within the connectionist community. > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > THE PARITY CLASS PROBLEM > > The alphabet: A = {a, b} > ------------ > > Input set S (i.e. the input space without the distance function): The set > ----------- > of strings over A. > > The parity class C: The set of strings with an even number of b's. 
> ------------------ > > Example of a positive training set C+: aababbbaabbaa > ------------------------------------- baabaaaababa > abbaaaaaaaaaaaaaaa > bbabbbbaaaaabab > aaa > > Solution to the parity problem, i.e. inductive (parity) class representation: > ----------------------------------------------------------------------------- > > One element from C+, e.g. 'aaa', plus the following 3 weighted operations > operations (note that the sum of the weights is 1) > deletion/insertion of 'a' (weight 0) > deletion/insertion of 'b' (weight 1) > deletion/insertion of 'bb' (weight 0) > > This means, in particular, that the DISTANCE FUNCTION D between any two > strings from the input set S is now defined as the shortest weighted path > (based on the above set of operations) between these strings. The class is > now defined as the set of all strings in the measurement space (S,D) whose > distance from aaa is 0. > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Why do, then, so many people work on the "connectionist symbol > processing"? On the one hand, many of us feel (correctly, in my opinion) > that the symbolic representation is a very important topic. On the other > hand, and I am quite sure of that, if we look CAREFULLY at any > corresponding concrete implementation, we would see that in order to > "learn" the chosen symbolic class one had to smuggle into the model, in > some form, some additional structure "equivalent" to the sought symbolic > structure (e.g. in the form of the recurrent ANN's architecture). This is, > again, due to the fact that in the vector space one simply cannot detect > (in a formally reasonable manner) any non-vector-space operations. > > > [1] L. Goldfarb, J. Abela, V.C. Bhavsar, V.N. Kamat, Can a vector space > based learning model discover inductive class generalization in a > symbolic environment? Pattern Recognition Letters, 16 (7), 1995, pp. > 719-726. > > [2] L. Goldfarb and J. Hook, Why classical models for pattern recognition > are not pattern recognition models, to appear in Proc. Intern. Conf. > on Advances in Pattern Recognition (ICAPR), ed. Sameer Singh, > Plymouth, UK, 23-25 Nov. 1998, Springer. > > [3] V.C. Bhavsar, A.A. Ghorbany, L. Goldfarb, Artificial neural networks > are not learning machines, Tech. Report, Faculty of Computer Science, > U.N.B. > > > --Lev Goldfarb > > http://wwwos2.cs.unb.ca/profs/goldfarb/goldfarb.htm From bryan at cog-tech.com Thu Aug 13 16:57:14 1998 From: bryan at cog-tech.com (Bryan B. Thompson) Date: Thu, 13 Aug 1998 16:57:14 -0400 Subject: Connectionist symbol processing: any progress? In-Reply-To: <23040.902820867@skinner.boltz.cs.cmu.edu> (Dave_Touretzky@cs.cmu.edu) Message-ID: <199808132057.QAA14403@cti2.cog-tech.com> Hello, We are currently engaged in cognitive modeling of reflexive (recognitional) and reflective (metacognitive) behaviors. To this end, we have used a structured connectionist model of inferential long-term memory, with good results. The reflexive system is based on the Shruti model proposed by Shastri and Ajjanagadde (1993, 1996). Working with Shastri, we have extended the model to incorporate supervised learning, priming, etc. We are currently working on an integration of belief and utility within this model. The resulting network will be able to not only reflexively construct interpretations of evidence from its environment, but will be able to reflexively plan and execute responses at multiple levels of abstraction as well. 
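For readers unfamiliar with the Shruti-style machinery Thompson refers to, a deliberately stripped-down sketch of temporal-synchrony variable binding may help (hypothetical code, not the Shruti model itself): entities are tagged by the phase in which they fire, a role is bound to whatever entity it fires in synchrony with, and a reflexive rule propagates bindings simply by copying phases from antecedent roles to consequent roles.

phase = {"John": 0, "Mary": 1, "Book": 2}          # each entity fires in its own phase

# Dynamic fact give(John, Mary, Book): each role node fires in its filler's phase.
give = {"giver": phase["John"],
        "recipient": phase["Mary"],
        "object": phase["Book"]}

# Rule give(x, y, z) => own(y, z): activation copies phases to the consequent's roles.
own = {"owner": give["recipient"], "owned": give["object"]}

# Read the inferred bindings back off by matching phases to entities.
entity_in_phase = {p: name for name, p in phase.items()}
print("inferred: own(%s, %s)" % (entity_in_phase[own["owner"]],
                                 entity_in_phase[own["owned"]]))
# -> inferred: own(Mary, Book)

In the full model the "phases" are spike times within an oscillation cycle and the copying is done by connections between role nodes, but the binding-by-synchrony bookkeeping is the same.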
The reflexive system is coupled to a metacognitive system, which is responsible for directing the focus of attention, making and testing assumptions, identifying and responding to conflicting interpretations and/or goals, locating unreliable conclusions, and managing risk. The perspective that only a fully distributed representation represents a connectionist solution to structure, variable binding, etc. is, perhaps, what warrants challenge. The brain is by no measure without internal structure on both gross and very detailed levels. While research has yet to identify rich mechanisms for dealing with structured representations and inference within a fully distributed representation, it has also yet to fully explore the potential of specialized neural structures for systematic reasoning. -- bryan thompson Cognitive Technologies, Inc. bryan at cog-tech.com References: Thompson, B.B., Cohen, M.S., Freeman, J.T. (1995). Metacognitive behavior in adaptive agents. In Proceedings of the World Congress on Neural Networks, (IEEE, July). Cohen, Marvin S. and Freeman, Jared T. (1996). Thinking naturally about uncertainty. In Proceedings of the Human Factors & Ergonomics Society, 40th Annual Meeting. Santa Monica, CA: Human Factors Society. From mike at cns.bu.edu Thu Aug 13 22:00:20 1998 From: mike at cns.bu.edu (Michael Cohen) Date: Thu, 13 Aug 1998 22:00:20 -0400 Subject: Connectionist symbol processing: any progress? References: <199808131910.WAA02170@gemini.math.tau.ac.il> Message-ID: <35D39A34.417A8D4A@cns.bu.edu> Eytan Ruppin wrote: > There is really no ``contradiction in terms'' here. Indeed, associative > memory networks (or attractor neural networks) can activate only one > stored concept at a time. However, such networks should not be viewed as > representing the whole brain but should be (and indeed > are) viewed as representing modular cortical structures such as columns. > Given this interpretation, the problem is resolved; If these networks are > sufficiently loosely coupled then many patterns can be > activated together, resulting in complex and rich dynamics. > > We should be careful before discarding our models on false grounds. We have > too few viable models that can serve as paradigms of information processing. > > Best wishes, > > Eytan Ruppin. I think the real question is what substantial validated progress has been made over and above Formal Language // Transformational Grammar // Standard Artificial Intelligence on aspects of human language processing or parsing via connectionist methods. Have we better understood technological projects such as machine translation or semantic processing, or do we have hard experimental evidence that these techniques are valuable in understanding what humans do? If not, other ideas had best be pursued; if so, then we should keep on plugging away at the problem using these techniques. It's not whether in principle something can be stated using a network automaton. It's whether the language is good for the problem at hand. Surely Thermodynamics is Turing Computable, however Turing Machines alas are good in and of themselves for almost nothing save producing Models of Computability in which to gauge complexity of Algorithms.
--mike -- Michael Cohen mike at cns.bu.edu Associate Professor, Center for Adaptive Systems Work: 677 Beacon, Street, Rm313 Boston, Mass 02115 Home: 25 Stearns Rd, #3 Brookline, Mass 02146 Tel-Work: 617-353-9484 Tel-Home:617-353-7755 From r.gayler at psych.unimelb.edu.au Fri Aug 14 07:57:14 1998 From: r.gayler at psych.unimelb.edu.au (Ross Gayler) Date: Fri, 14 Aug 1998 21:57:14 +1000 Subject: Connectionist symbol processing: any progress? Message-ID: <3.0.32.19980814214225.00699820@myriad.unimelb.edu.au> At 11:33 12/08/98 -0700, Jerry Feldman wrote: .. > It is true that none of this is much like Touretsky's >early attempt at a holographic LISP and that there has >been essentially no work along these lines for a decade. >There are first order computational reasons for this. >These can be (and have been) spelled out technically >but the basic idea is straightforward - PDP (Parallel >Distributed Processing) is a contradiction in terms. To >the extent that representing a concept involves all of >the units in a system, > only one concept can be active at a time. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > Dave Rumelhart says this is stated somewhere in >the original PDP books, but I forget where. The same >basic point accounts for the demise of the physicists' >attempts to model human memory as a spin glass. >Distributed representations do occur in the brain and >are useful in many tasks, conceptual representation just >isn't one of them. .. I would like to see where it has been "spelled out technically" that in a connectionist system "only one concept can be active at a time", because there must be some false assumptions in the proof. This follows from the fact that the systems developed by, for example, Smolensky, Kanerva, Plate, Gayler, and Halford et al *depend* on the ability to manipulate multiple superposed representations, and they actually work. I do accept that > It is true that none of this is much like Touretsky's >early attempt at a holographic LISP and partially accept that there has >been essentially no work along these lines for a decade. but explain it by: 1) Touretzky's work was an important demonstration of technical capability but not a serious attempt at a cognitive architecture. There is no reason to extend that particular line of work. 2) Although the outer-product architectures can (and have) been used with weight learning procedures, such as backpropagation, one of their major attractions is that so much can be achieved without iterative learning. To pursue this line of research requires the power to come from the architecture rather than an optimisation algorithm and a few thousand degrees of freedom. Therefore, this line of research is much less likely to produce a publishable result in a given time frame for a fixed effort (because you can't paper over the gaps with a few extra df). 3) The high-risk, high-effort nature of research into outer-product cognitive architectures without optimisation algorithms makes it unattractive to most researchers. You can't give a problem like this to a PhD student because you don't know the probability of a publishable result. The same argument applies to grant applications. The rational researcher is better advised to attack a more obviously soluble problem. So, I partially disagree with the statement that there has been >essentially no work along these lines for a decade. because there has been related (more cognitively focussed) work proceeding for the last decade. 
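The "multiple superposed representations" point above is easy to make concrete. Below is a minimal sketch in the spirit of Plate's Holographic Reduced Representations (hypothetical code, with invented symbols): role-filler pairs are bound by circular convolution, several bindings are superposed in a single vector, and an approximate inverse recovers any one filler from the superposition.

import numpy as np

rng = np.random.default_rng(1)
n = 1024

def vec():                                  # random HRR vector, expected norm ~1
    return rng.normal(0.0, 1.0 / np.sqrt(n), n)

def bind(a, b):                             # circular convolution
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def inverse(a):                             # approximate inverse (involution)
    return np.concatenate(([a[0]], a[1:][::-1]))

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

agent, patient, verb = vec(), vec(), vec()  # roles
john, mary, kiss = vec(), vec(), vec()      # fillers

# One trace holds three role-filler bindings superposed at the same time.
trace = bind(agent, john) + bind(patient, mary) + bind(verb, kiss)

# Unbind the agent role and see which symbol the result resembles.
probe = bind(trace, inverse(agent))
for name, v in [("john", john), ("mary", mary), ("kiss", kiss)]:
    print(name, round(cosine(probe, v), 2))
# 'john' scores far above the others even though all three bindings are
# simultaneously active in the same distributed vector.

This is the sense in which these outer-product/convolution architectures are not limited to one concept at a time: the price is noise that grows with the number of superposed bindings, not a hard limit of one.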
It has just been relatively quiet and carried out by a few people who can afford to take on a high effort, high risk project. Cheers, Ross Gayler From goldfarb at unb.ca Fri Aug 14 09:58:44 1998 From: goldfarb at unb.ca (Lev Goldfarb) Date: Fri, 14 Aug 1998 10:58:44 -0300 (ADT) Subject: Connectionist symbol processing: any progress? In-Reply-To: <35D34861.CCE67E25@ministryofthought.com> Message-ID: On Thu, 13 Aug 1998, Mitsu Hadeishi wrote: > I'm not quite sure I agree with your analysis. Since I haven't looked at it in > great detail, I present this as a tentative critique of your presentation. > > Since recurrent ANNs can be made to carry out any Turing operation (i.e., modulo > their finite size, they are Turing equivalent), then unless you are saying that > your represention cannot be implemented on a Turing machine, then it is clearly > NOT the case that recurrent ANNs *cannot* learn arbitrary symbolic > representations. Not having looked at your scheme in detail, of course, I don't > know whether your scheme somehow is unimplementable even on a Turing machine, but > it seems to me you must not be claiming this. Yes, I'm not claiming this. Moreover, I'm not discussing the SIMULATIONAL, or computing, power of a learning model, since in a simulation of a Turing machine one doesn't care how the actual "parameters" are constructed based on a small training set, i.e. no learning is involved. Many present (and future) computational models are "Turing equivalent". That doesn't make them learning models, does it? In other words, as I mentioned in my original posting, IF YOU KNOW THE SYMBOLIC CLASS STRUCTURE of course you can simulate, or encode, it on many machines (similar quote straight from the horse's mouse i.e. from Siegelmann and Sontag paper "On the computational power of neural nets" , section 1.2: "Many other types of 'machines' may be used for universality"). Again, the discovery of the symbolic class structure is a fundamentally different matter, much less trivial than just a simple encoding of this structure using any other, including numeric, structure. > You seem to be basing your argument on the notion that the input space to a > recurrent ANN is a set of numbers, which you interpret as the coordinates of a > vector. However, this is only a kind of vague analogy, since the field > operations of the vector space (addition, multiplication, etc.) have no clear > meaning on the input space. "Adding" two input vectors does not necessarily > result in anything meaningful except in the sense that the recurrent ANN to be > useful must be locally stable with respect to small variations in the input. > However, the actual structure or metric of the input space is in some sense > determined not a priori but by the state of the recurrent ANN itself, and can > change over time both as a result of training and as a result of iteration. The > input space is numbers, yes, but that doesn't make it a vector space. For > example, what properties of the input would be preserved if I, say, added the > vector (10^25, 10^25, ...) to the input? If it is a "vector space" then that > operation would yield something sensible, some symmetries, and yet it obviously > does not. Thus, while I sympathize with your claim that the vector field of R(n) > does not admit to the structure necessary to make visible much symbolic > structure, this in itself does not doom connectionist symbol processing by any > means. 
It appears that something VERY BASIC is missing from the above description: How could a recurrent net learn without some metric and, as far as I know, some metric equivalent to the Euclidean metric? (All "good' metrics on a finite-dimensional vector space are equivalent to the Euclidean, see [1] in my original posting.) > Your argument does have weight when applied to a single-layer perceptron, which > is, after all, just a thresholded/distorted linear transformation. Although it > seemed to take the early connectionist community by surprise, it should be no > surprise at all that a single-layer perceptron cannot learn the parity problem, > because obviously the parity problem is not linearly separable, and how could any > linear discriminator possibly learn a non-linearly-separable problem? However, > we do not live in a world of single-layer perceptrons. Because networks are more > complex than this, arguments about the linearity of the input space seem to me > rather irrelevant. I suspect you mean something else, however. Yes, I do. > I think the intuitive point you are perhaps trying to make is that symbolic > representations are arbitrarily nestable (recursively recombinable), and an input > space which consists of a fixed number of dimensions cannot handle recursive > combinations. However, one can use time-sequence to get around this problem (as > we all are doing when we read and write for example). Rather than make our eyes, > for example, capable of handling arbitrarily recombinable input all at once, we > sequence the input to our eyes by reading material over time. The same trick can > be used with recurrent networks for example. Mitsu, I'm not making just this point. The main point I'm making can be stated as follows. Inductive learning requires some object dissimilarity and/or similarity, measure(s). The accumulated mathematical experience strongly suggests that the distance in the input space must be consistent with the underlying operational, or compositional, structure of the chosen object representation (e.g. topological group, topological vector space, etc). It turns out that while the classical vector space (because of the "simple" compositional structure of its objects) allows essentially one metric consistent with the underlying algebraic structure [1], each symbolic "space" (e.g. strings, trees, graphs) allows infinitely many of them. In the latter case, the inductive learning becomes the learning of the corresponding class distance function (refer to my parity example). Moreover, since some noise is always present in the training set, I cannot imagine how RELIABLE symbolic inductive class structure can be learned from a SMALL training set without the right symbolic bias and without the help of the corresponding symbolic distance measures. Cheers, Lev From pierre at mbfys.kun.nl Fri Aug 14 06:20:16 1998 From: pierre at mbfys.kun.nl (Pierre v.d. 
Laar) Date: Fri, 14 Aug 1998 12:20:16 +0200 Subject: Pruning Using Parameter and Neuronal Metrics Message-ID: <35D40F5F.A71E6B07@mbfys.kun.nl> Dear Connectionists, The following article which has been accepted for publication in Neural Computation can now be downloaded from our ftp-server as ftp://ftp.mbfys.kun.nl/snn/pub/reports/vandeLaar.NC98.ps.Z Yours sincerely, Pierre van de Laar Pruning Using Parameter and Neuronal Metrics written by Pierre van de Laar and Tom Heskes Abstract: In this article, we introduce a measure of optimality for architecture selection algorithms for neural networks: the distance from the original network to the new network in a metric that is defined by the probability distributions of all possible networks. We derive two pruning algorithms, one based on a metric in parameter space and another one based on a metric in neuron space, which are closely related to well-known architecture selection algorithms, such as GOBS. Furthermore, our framework extends the theoretically range of validity of GOBS and therefore can explain results observed in previous experiments. In addition, we give some computational improvements for these algorithms. FTP INSTRUCTIONS unix% ftp ftp.mbfys.kun.nl Name: anonymous Password: (use your e-mail address) ftp> cd snn/pub/reports/ ftp> binary ftp> get vandeLaar.NC98.ps.Z ftp> bye unix% uncompress vandeLaar.NC98.ps.Z unix% lpr vandeLaar.NC98.ps From mitsu at ministryofthought.com Fri Aug 14 14:44:07 1998 From: mitsu at ministryofthought.com (Mitsu Hadeishi) Date: Fri, 14 Aug 1998 11:44:07 -0700 Subject: Connectionist symbol processing: any progress? References: Message-ID: <35D48576.7EFA0648@ministryofthought.com> Lev, Okay, so we agree on the following: Recurrent ANNs have the computational power required. The only thing at issue is the learning algortihm. >How could a recurrent net learn without some metric and, as >far as I know, some metric equivalent to the Euclidean metric? Your arguments so far seem to be focusing on the "metric" on the input space, but this does not in itself mean anything at all about the metric of the learning algorithm as a whole. Clearly the input space is NOT a vector space in the usual sense of the word, at least if you use the metric which is defined by the whole system (learning algortihm, error measure, state of the network). So, what you are now saying is that the metric must be equivalent (rather than equal to) to the Euclidean metric: you do not define what you mean by this. The learning "metric" in the connectionist paradigm changes over time: it is a function of the structure of the learning algorithm and the state of the network, as I mentioned above. The only sense in which the metric is "equivalent" to the Euclidean metric is locally; that is, due to the need to discriminate noise, this metric must be locally stable, thus there is an open neighborhood around most points in the topological input space for which the "metric" vanishes. However, the metric can be quite complex, it can have singularities, it can change over time, it can fold back onto itself, etc. This local stability may not be of interest, however, since the input may be coded so that each discrete possibility is coded as exact numbers which are separated in space. In this case the input space may not be a continuous space at all, but a discrete lattice or something else. 
If the input space is a lattice, then there are no small open neighborhoods around the input points, and thus even this similaity to the Euclidean metric no longer applies. At least, so far, your arguments do not seem to show anything beyond this. >It turns out that while >the classical vector space (because of the "simple" compositional >structure of its objects) allows essentially one metric consistent with >the underlying algebraic structure [1], each symbolic "space" (e.g. >strings, trees, graphs) allows infinitely many of them. Recurrent networks spread the representation of a compound symbol over time; thus, you can present a string of symbols to a recurrent network and its internal state will change. You have not shown, it seems to me, that in this case the learning metric would look anything like a Euclidean metric, or that there would be only "one" such metric. In fact it seems obvious to me that this would NOT be the case. I would like to hear why you might disagree. Mitsu Lev Goldfarb wrote: > On Thu, 13 Aug 1998, Mitsu Hadeishi wrote: > > > I'm not quite sure I agree with your analysis. Since I haven't looked at it in > > great detail, I present this as a tentative critique of your presentation. > > > > Since recurrent ANNs can be made to carry out any Turing operation (i.e., modulo > > their finite size, they are Turing equivalent), then unless you are saying that > > your represention cannot be implemented on a Turing machine, then it is clearly > > NOT the case that recurrent ANNs *cannot* learn arbitrary symbolic > > representations. Not having looked at your scheme in detail, of course, I don't > > know whether your scheme somehow is unimplementable even on a Turing machine, but > > it seems to me you must not be claiming this. > > Yes, I'm not claiming this. Moreover, I'm not discussing the SIMULATIONAL, > or computing, power of a learning model, since in a simulation of a Turing > machine one doesn't care how the actual "parameters" are constructed based > on a small training set, i.e. no learning is involved. Many present (and > future) computational models are "Turing equivalent". That doesn't make > them learning models, does it? > > In other words, as I mentioned in my original posting, IF YOU KNOW THE > SYMBOLIC CLASS STRUCTURE of course you can simulate, or encode, it on > many machines (similar quote straight from the horse's mouse i.e. from > Siegelmann and Sontag paper "On the computational power of neural nets" , > section 1.2: "Many other types of 'machines' may be used for > universality"). Again, the discovery of the symbolic class structure is a > fundamentally different matter, much less trivial than just a simple > encoding of this structure using any other, including numeric, structure. > > > You seem to be basing your argument on the notion that the input space to a > > recurrent ANN is a set of numbers, which you interpret as the coordinates of a > > vector. However, this is only a kind of vague analogy, since the field > > operations of the vector space (addition, multiplication, etc.) have no clear > > meaning on the input space. "Adding" two input vectors does not necessarily > > result in anything meaningful except in the sense that the recurrent ANN to be > > useful must be locally stable with respect to small variations in the input. 
> > However, the actual structure or metric of the input space is in some sense > > determined not a priori but by the state of the recurrent ANN itself, and can > > change over time both as a result of training and as a result of iteration. The > > input space is numbers, yes, but that doesn't make it a vector space. For > > example, what properties of the input would be preserved if I, say, added the > > vector (10^25, 10^25, ...) to the input? If it is a "vector space" then that > > operation would yield something sensible, some symmetries, and yet it obviously > > does not. Thus, while I sympathize with your claim that the vector field of R(n) > > does not admit to the structure necessary to make visible much symbolic > > structure, this in itself does not doom connectionist symbol processing by any > > means. > > It appears that something VERY BASIC is missing from the above > description: How could a recurrent net learn without some metric and, as > far as I know, some metric equivalent to the Euclidean metric? (All "good' > metrics on a finite-dimensional vector space are equivalent to the > Euclidean, see [1] in my original posting.) > > > Your argument does have weight when applied to a single-layer perceptron, which > > is, after all, just a thresholded/distorted linear transformation. Although it > > seemed to take the early connectionist community by surprise, it should be no > > surprise at all that a single-layer perceptron cannot learn the parity problem, > > because obviously the parity problem is not linearly separable, and how could any > > linear discriminator possibly learn a non-linearly-separable problem? However, > > we do not live in a world of single-layer perceptrons. Because networks are more > > complex than this, arguments about the linearity of the input space seem to me > > rather irrelevant. I suspect you mean something else, however. > > Yes, I do. > > > I think the intuitive point you are perhaps trying to make is that symbolic > > representations are arbitrarily nestable (recursively recombinable), and an input > > space which consists of a fixed number of dimensions cannot handle recursive > > combinations. However, one can use time-sequence to get around this problem (as > > we all are doing when we read and write for example). Rather than make our eyes, > > for example, capable of handling arbitrarily recombinable input all at once, we > > sequence the input to our eyes by reading material over time. The same trick can > > be used with recurrent networks for example. > > Mitsu, > > I'm not making just this point. > > The main point I'm making can be stated as follows. Inductive learning > requires some object dissimilarity and/or similarity, measure(s). The > accumulated mathematical experience strongly suggests that the distance in > the input space must be consistent with the underlying operational, or > compositional, structure of the chosen object representation (e.g. > topological group, topological vector space, etc). It turns out that while > the classical vector space (because of the "simple" compositional > structure of its objects) allows essentially one metric consistent with > the underlying algebraic structure [1], each symbolic "space" (e.g. > strings, trees, graphs) allows infinitely many of them. In the latter > case, the inductive learning becomes the learning of the corresponding > class distance function (refer to my parity example). 
Moreover, since > some noise is always present in the training set, I cannot imagine how > RELIABLE symbolic inductive class structure can be learned from a SMALL > training set without the right symbolic bias and without the help of the > corresponding symbolic distance measures. > > Cheers, > Lev From henders at linc.cis.upenn.edu Fri Aug 14 17:01:06 1998 From: henders at linc.cis.upenn.edu (Jamie Henderson) Date: Fri, 14 Aug 1998 17:01:06 -0400 (EDT) Subject: Connectionist symbol processing: any progress? In-Reply-To: <23040.902820867@skinner.boltz.cs.cmu.edu> (Dave_Touretzky@cs.cmu.edu) Message-ID: <199808142101.RAA16445@linc.cis.upenn.edu> Dave Touretzky writes: >I'd like to start a debate on the current state of connectionist >symbol processing? Is it dead? Or does progress continue? .. >The problem, though, was that we >did not have good techniques for dealing with structured information >in distributed form, or for doing tasks that require variable binding. >While it is possible to do these things with a connectionist network, >the result is a complex kludge that, at best, sort of works for small >problems, but offers no distinct advantages over a purely symbolic >implementation. The cases where people had shown interesting >generalization behavior in connectionist nets involved simple >vector-based representations, without nested structures or variable >binding. I just gave a paper at the COLING-ACL'98 conference, which is the main international conference for Computational Linguistics. The paper is on learning to do syntactic parsing using a connectionist architecture that extends SRNs with Temporal Synchrony Variable Binding (ala SHRUTI). This architecture does generalize in a structural way, with variable binding. Crucially, the paper evaluates this learning method on a real corpus of naturally occurring text, and gets results that approach the state of the art in the field (which is all statistical methods these days). I received a surprisingly positive response to this paper. I got comments like "I've never taken connectionist NLP seriously, but you're playing the same game as us". "The game" is training and testing on large corpora of real text, not toy domains. The winner is the method with the lowest error rate. I see three morals in this: - Connectionist approaches to processing structural information have made significant progress, to the point that they can now be justified on purely empirical/engineering grounds. - Connectionist methods do solve problems that current non-connectionist methods have (ad-hoc independence assumptions, sparse data, etc.), and people working in learning know it. - Connectionist NLP researchers should be using modern empirical methods, and they will be taken seriously if they do. The paper is available from my web page (http://www.dcs.ex.ac.uk/~jamie/). Below is the reference and abstract. - Jamie Henderson Henderson, J. and Lane, P. (1998) A Connectionist Architecture for Learning to Parse. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, University of Montreal, Canada. Abstract: We present a connectionist architecture and demonstrate that it can learn syntactic parsing from a corpus of parsed text. The architecture can represent syntactic constituents, and can learn generalizations over syntactic constituents, thereby addressing the sparse data problems of previous connectionist architectures. 
We apply these Simple Synchrony Networks to mapping sequences of word tags to parse trees. After training on parsed samples of the Brown Corpus, the networks achieve precision and recall on constituents that approaches that of statistical methods for this task. (7 pages) ------------------------------- Dr James Henderson Department of Computer Science University of Exeter Exeter EX4 4PT, U.K. http://www.dcs.ex.ac.uk/~jamie/ jamie at dcs.ex.ac.uk ------------------------------- From arbib at pollux.usc.edu Fri Aug 14 18:07:20 1998 From: arbib at pollux.usc.edu (Michael A. Arbib) Date: Fri, 14 Aug 1998 14:07:20 -0800 Subject: What have neural networks achieved? Message-ID: Recently, Stuart Russell addressed the following query to Fellows of the AAAI: > This Saturday there will be a debate with John McCarthy, David Israel, > Stuart Dreyfus and myself on the topic of > "How is the quest for artificial intelligence progressing?" > This is widely publicized, likely to be partially televised, > and will be attended by a lot of journalists. > > For this, and for AAAI's future reference, I'd like to collect > convincing examples of progress, particularly examples that will > convince journalists and the general public. For now all I need > is a URL or other accessible pointer and a one or two sentence > description. (It does not *necessarily* have to be your own work!) > Pictures would be very helpful. This spurs me as I work on the 2nd edition of the Handbook of Brain Theory and Neural Networks (due out in 2 years or so; MIT Press has just issued a paperback of the first edition) to pose to you two related questions: a) What are the "big success stories" (i.e., of the kind the general public could understand) for neural networks contributing to the understanding of "real" brains, i.e., within the fields of cognitive science and neuroscience. b) What are the "big success stories" (i.e., of the kind the general public could understand) for neural networks contributing to the construction of "artificial" brains, i.e., successfully fielded applications of NN hardware and software that have had a major commercial or other impact? ********************************* Michael A. Arbib USC Brain Project University of Southern California Los Angeles, CA 90089-2520, USA arbib at pollux.usc.edu (213) 740-9220; Fax: 213-740-5687 http://www-hbp.usc.edu/HBP/ From goldfarb at unb.ca Sat Aug 15 19:29:10 1998 From: goldfarb at unb.ca (Lev Goldfarb) Date: Sat, 15 Aug 1998 20:29:10 -0300 (ADT) Subject: Connectionist symbol processing: any progress? In-Reply-To: <35D48576.7EFA0648@ministryofthought.com> Message-ID: On Fri, 14 Aug 1998, Mitsu Hadeishi wrote: > Lev, > > Okay, so we agree on the following: > > Recurrent ANNs have the computational power required. The only > thing at issue is the learning algortihm. "The ONLY thing at issue" IS the MAIN thing at issue, because the simulation of the Turing machine is just a clever game, while an adequate model of inductive learning should, among many other things, change our understanding of what science is (see, for example, Alexander Bird, Philosophy of Science, McGill-Queen's University Press, 1998). > >How could a recurrent net learn without some metric and, as > >far as I know, some metric equivalent to the Euclidean metric? > > Your arguments so far seem to be focusing on the "metric" on the > input space, but this does not in itself mean anything at all about > the metric of the learning algorithm as a whole. 
What do you mean by "the metric of the learning algorithm as a whole"? There is no such concept as "the metric of the learning algorithm as a whole". > Clearly the input > space is NOT a vector space in the usual sense of the word, at least > if you use the metric which is defined by the whole system (learning > algorithm, error measure, state of the network). If "the input space is NOT a vector space in the usual sense of the word", then what is it? Are we talking about the formal concepts known in mathematics, or do we not care about such "trifle" things at all? Remember that "even" physicists care about such things, and I said "even", because to model the inductive learning we will need more abstract models. > So, what you are now > saying is that the metric must be equivalent (rather than equal to) to > the Euclidean metric: you do not define what you mean by this. [metrics are equivalent if they induce the same topology, or the same convergence] > The learning "metric" in the connectionist paradigm changes over > time: it is a function of the structure of the learning algorithm and > the state of the network, as I mentioned above. The only sense in > which the metric is "equivalent" to the Euclidean metric is locally; > that is, due to the need to discriminate noise, this metric must be > locally stable, thus there is an open neighborhood around most points > in the topological input space for which the "metric" vanishes. > However, the metric can be quite complex, it can have singularities, > it can change over time, it can fold back onto itself, etc. > > This local stability may not be of interest, however, since the > input may be coded so that each discrete possibility is coded as exact > numbers which are separated in space. In this case the input space > may not be a continuous space at all, but a discrete lattice or > something else. If the input space is a lattice, then there are no > small open neighborhoods around the input points, and thus even this > similarity to the Euclidean metric no longer applies. At least, so > far, your arguments do not seem to show anything beyond this. > > >It turns out that while > >the classical vector space (because of the "simple" compositional > >structure of its objects) allows essentially one metric consistent with > >the underlying algebraic structure [1], each symbolic "space" (e.g. > >strings, trees, graphs) allows infinitely many of them. > > Recurrent networks spread the representation of a compound symbol > over time; thus, you can present a string of symbols to a recurrent > network and its internal state will change. You have not shown, it > seems to me, that in this case the learning metric would look anything > like a Euclidean metric, or that there would be only "one" such > metric. In fact it seems obvious to me that this would NOT be the > case. I would like to hear why you might disagree. Mitsu, Forgive me for the analogy, but from the above as well as from other published sources, it appears to me that in the "connectionist symbol processing", by throwing into one model two, I strongly suggest, INCOMPATIBLE ingredients (vector space model and the symbolic operations) one hopes to prepare a magic soup for inductive learning. I strongly believe that this is not a scientifically fruitful approach. Why? Can I give you a one sentence answer?
If you look very carefully at the topologies induced on the set of strings (over an alphabet of size > 1) by various symbolic distances (of type given in the parity class problem), then you will discover that they have hardly anything to do with the continuous topologies we are used to from the classical mathematics. In this sense, the difficulties ANNs have with the parity problem are only the tip of the iceberg. So, isn't it scientifically more profitable to work DIRECTLY with the symbolic topologies, i.e. the symbolic distance functions, by starting with some initial set of symbolic operations and then proceeding in a systematic manner to seek the optimal topology (i.e. the optimal set of weighted operations) for the training set. To simplify things, this is what the evolving transformation system model we are developing attempts to do. It appears that there are profound connections between the relevant symbolic topologies (and hardly any connections with the classical numeric topologies). Based on those connections, we are developing an efficient inductive learning model that will work with MUCH SMALLER training set than has been the case in the past. The latter is possible due to the fact that, typically, computation of the distance between two strings involves many operations and the optimization function involves O(n*n) interdistances, where n is the size of the training set. Cheers, Lev From mitsu at ministryofthought.com Sat Aug 15 20:22:52 1998 From: mitsu at ministryofthought.com (Mitsu Hadeishi) Date: Sat, 15 Aug 1998 17:22:52 -0700 Subject: Connectionist symbol processing: any progress? References: Message-ID: <35D6265C.D637A673@ministryofthought.com> Lev Goldfarb wrote: > On Fri, 14 Aug 1998, Mitsu Hadeishi wrote: > > Your arguments so far seem to be focusing on the "metric" on the > > input space, but this does not in itself mean anything at all about > > the metric of the learning algorithm as a whole. > > What does it mean "the metric of the learning algorithm as a whole"? > There is no such a concept as "the metric of the learning algorithm as a > whole". Since you are using terms like "metric" extremely loosely, I was also doing so. What I mean here is that with certain connectionist schemes, for example those that use an error function of some kind, one could conceive of the error measure as a kind of distance function (however, it is not a metric formally speaking). However, the error measure is far more useful and important than the "metric" you might impose on the input space when conceiving of it as a vector space, since the input space is NOT a vector space. > If "the input space is NOT a vector space in the usual sense of the word", > then what is it? Are we talking about the formal concepts known in > mathematics or we don't care about such "trifle" things at all? > Remember, that "even" physicists care about such things, and I said > "even", because to model the inductive learning we will need more > abstract models. As you (should) know, a vector space is supposed to have vector field symmetries. For example, something should be preserved under rotations and translations of the input vectors. However, what do you get when you do arbitrary rotations of the input to a connectionist network? I don't mean rotations of, say, the visual field to a pattern recognition network, but rather taking the actual values of the inputs to each neuron in a network as coordinates to a vector, and then "rotating" them or translating them, or both. 
What meaning does this have when used with a recurrent connectionist architecture? It seems to me that it has very little meaning if any. > [metrics are equivalent if they induce the same topology, or the same > convergence] Again, the only really important function is the structure of the error function, not the "metric" on the input space conceived as a vector space, and it isn't even a metric in the usual sense of the word. > Can I give you a one sentence answer? If you look very carefully at the > topologies induced on the set of strings (over an alphabet of size > 1) by > various symbolic distances (of type given in the parity class problem), > then you will discover that they have hardly anything to do with the > continuous topologies we are used to from the classical mathematics. In > this sense, the difficulties ANNs have with the parity problem are only > the tip of the iceberg. I do not dispute the value of your work, I simply dispute the fact that you seem to think it dooms connectionist approaches, because your intuitive arguments against connectionist approaches are not cogent it seems to me. While your work is probably quite valuable, and I think I understand what you are getting at, I see no reason why what you are talking about would prevent a connectionist approach (based on a recurrent or more sophisticated architecture) from being able to discover the same symbolic metric---because, as I say, the input space is not in any meaningful sense a vector space, and the recurrent architecture allows the "metric" of the learning algorithm, it seems to me, to acquire precisely the kind of structure that you need it to---or, at least, I do not see in principle why it cannot. The reason this is so is again because the input is spread out over multiple presentations to the network. There are good reasons to use connectionist schemes, however, I believe, as opposed to purely symbolic schemes. For one: symbolic techniques are inevitably limited to highly discrete representations, whereas connectionist architectures can at least in theory combine both discrete and continuous representations. Two, it may be that the simplest or most efficient representation of a given set of rules may include both a continous and a discrete component; that is, for example, considering issues such as imprecise application of rules, or breaking of rules, and so forth. For example, consider poetic speech; the "rules" for interpreting poetry are clearly not easily enumerable, yet human beings can read poetry and get something out of it. A purely symbolic approach may not be able to easily capture this, whereas it seems to me a connectionist approach has a better chance of dealing with this kind of situation. I can see value in your approach, and things that connectionists can learn from it, but I do not see that it dooms connectionism by any means. Mitsu > > > So, isn't it scientifically more profitable to work DIRECTLY with the > symbolic topologies, i.e. the symbolic distance functions, by starting > with some initial set of symbolic operations and then proceeding in a > systematic manner to seek the optimal topology (i.e. the optimal set of > weighted operations) for the training set. To simplify things, this is > what the evolving transformation system model we are developing attempts > to do. It appears that there are profound connections between the relevant > symbolic topologies (and hardly any connections with the classical numeric > topologies). 
Based on those connections, we are developing an efficient > inductive learning model that will work with MUCH SMALLER training set > than has been the case in the past. The latter is possible due to the fact > that, typically, computation of the distance between two strings involves > many operations and the optimization function involves O(n*n) > interdistances, where n is the size of the training set. > > Cheers, > Lev From mitsu at ministryofthought.com Sat Aug 15 20:37:06 1998 From: mitsu at ministryofthought.com (Mitsu Hadeishi) Date: Sat, 15 Aug 1998 17:37:06 -0700 Subject: Connectionist symbol processing: any progress? References: <35D6265C.D637A673@ministryofthought.com> Message-ID: <35D629B2.83598619@ministryofthought.com> Mitsu Hadeishi wrote: > Lev Goldfarb wrote: > > However, the error measure is far more useful and important than > the "metric" you might impose on the input space when conceiving of it as a > vector space, since the input space is NOT a vector space. Clarification: I really should say you do not have to conceive of the input space as a vector space. It may in fact behave like a vector space (locally) if the architecture of the network, the nature of the learning algorithm, and the training sets are structured in a particular way. However, it will not necessarily behave this way as the network evolves---and particularly if you conceive of the input space as spread out through time for a recurrent network, the notion of it as a vector space doesn't work at all. The main point is that it is the feedback mechanism (error function or other mechanism) which is truly important when considering how the learning algorithm is biased and will evolve, not the "metric" on the initial input space. Mitsu From goldfarb at unb.ca Sat Aug 15 22:32:12 1998 From: goldfarb at unb.ca (Lev Goldfarb) Date: Sat, 15 Aug 1998 23:32:12 -0300 (ADT) Subject: Connectionist symbol processing: any progress? In-Reply-To: <35D6265C.D637A673@ministryofthought.com> Message-ID: On Sat, 15 Aug 1998, Mitsu Hadeishi wrote: > Since you are using terms like "metric" extremely loosely, I was also doing > so. Please, note that although I'm not that precise, I have not used the "terms like 'metric' extremely loosely". > What I mean here is that with certain connectionist schemes, for example > those that use an error function of some kind, one could conceive of the error > measure as a kind of distance function (however, it is not a metric formally > speaking). However, the error measure is far more useful and important than > the "metric" you might impose on the input space when conceiving of it as a > vector space, since the input space is NOT a vector space. If the input and "the intermediate" spaces are not vector spaces, then what is the advantage of the "connectionist" architectures? > > If "the input space is NOT a vector space in the usual sense of the word", > > then what is it? Are we talking about the formal concepts known in > > mathematics or we don't care about such "trifle" things at all? > > Remember, that "even" physicists care about such things, and I said > > "even", because to model the inductive learning we will need more > > abstract models. > > As you (should) know, a vector space is supposed to have vector field > symmetries. For example, something should be preserved under rotations and > translations of the input vectors. However, what do you get when you do > arbitrary rotations of the input to a connectionist network? 
I don't mean > rotations of, say, the visual field to a pattern recognition network, but > rather taking the actual values of the inputs to each neuron in a network as > coordinates to a vector, and then "rotating" them or translating them, or > both. What meaning does this have when used with a recurrent connectionist > architecture? It seems to me that it has very little meaning if any. > > > [metrics are equivalent if they induce the same topology, or the same > > convergence] > > Again, the only really important function is the structure of the error > function, not the "metric" on the input space conceived as a vector space, and > it isn't even a metric in the usual sense of the word. > > > Can I give you a one sentence answer? If you look very carefully at the > > topologies induced on the set of strings (over an alphabet of size > 1) by > > various symbolic distances (of type given in the parity class problem), > > then you will discover that they have hardly anything to do with the > > continuous topologies we are used to from the classical mathematics. In > > this sense, the difficulties ANNs have with the parity problem are only > > the tip of the iceberg. > > I do not dispute the value of your work, I simply dispute the fact that you > seem to think it dooms connectionist approaches, because your intuitive > arguments against connectionist approaches are not cogent it seems to me. > While your work is probably quite valuable, and I think I understand what you > are getting at, I see no reason why what you are talking about would prevent a > connectionist approach (based on a recurrent or more sophisticated > architecture) from being able to discover the same symbolic metric---because, > as I say, the input space is not in any meaningful sense a vector space, and > the recurrent architecture allows the "metric" of the learning algorithm, it > seems to me, to acquire precisely the kind of structure that you need it > to---or, at least, I do not see in principle why it cannot. The reason this > is so is again because the input is spread out over multiple presentations to > the network. > > There are good reasons to use connectionist schemes, however, I believe, as > opposed to purely symbolic schemes. For one: symbolic techniques are > inevitably limited to highly discrete representations, whereas connectionist > architectures can at least in theory combine both discrete and continuous > representations. The main reason we are developing the ETS model is precisely related to the fact that we believe it offers THE ONLY ONE POSSIBLE NATURAL (and fundamentally new) SYMBIOSIS of the discrete and the continuous FORMALISMS as opposed to the unnatural ones. I would definitely say (and you would probably agree) that (if, indeed, this is the case) it is the most important consideration. Moreover, it turns out that the concept of a fuzzy set, which was originally introduced in a rather artificial manner that didn't clarify the underlying source of fuzziness (and this have caused an understandable and substantial resistance to its introduction), emerges VERY naturally within the ETS model: the definition of the class via the corresponding distance function typically and naturally induces the fuzzy class boundary and also reveals the source of fuzziness, which includes the interplay between the corresponding weighted operations and (in the case of noise in the training set) a nonzero radius. 
Note that in the parity class problem, the parity class is not fuzzy, as reflected in the corresponding weighting scheme and the radius of 0. > Two, it may be that the simplest or most efficient > representation of a given set of rules may include both a continuous and a > discrete component; that is, for example, considering issues such as imprecise > application of rules, or breaking of rules, and so forth. For example, > consider poetic speech; the "rules" for interpreting poetry are clearly not > easily enumerable, yet human beings can read poetry and get something out of > it. A purely symbolic approach may not be able to easily capture this, > whereas it seems to me a connectionist approach has a better chance of dealing > with this kind of situation. > > I can see value in your approach, and things that connectionists can learn > from it, but I do not see that it dooms connectionism by any means. See the previous comment. Cheers, Lev From mitsu at ministryofthought.com Sat Aug 15 23:47:28 1998 From: mitsu at ministryofthought.com (Mitsu Hadeishi) Date: Sat, 15 Aug 1998 20:47:28 -0700 Subject: Connectionist symbol processing: any progress? References: Message-ID: <35D65650.3B387A0D@ministryofthought.com> Lev Goldfarb wrote: > On Sat, 15 Aug 1998, Mitsu Hadeishi wrote: > > > Since you are using terms like "metric" extremely loosely, I was also doing > > so. > > Please, note that although I'm not that precise, I have not used the > "terms like 'metric' extremely loosely". I am referring to this statement: >How could a recurrent net learn without some metric and, as >far as I know, some metric equivalent to the Euclidean metric? Here you are talking about the input space as though the Euclidean metric on that space is particularly key, when it is rather the structure of the whole network, the feedback scheme, the definition of the error measure, the learning algorithm, and so forth which actually create the relevant and important mathematical structure. In a sufficiently complex network, you can pretty much get any arbitrary map you like from the input space to the output, and the error measure is biased by the specific nature of the training set (for example), and is measured on the output of the network AFTER it has gone through what amounts to an arbitrary differentiable transformation. By this time, the "metric" on the original input space can be all but destroyed. Add recurrency and you even get rid of the fixed dimensionality of the input space. In the quote above, it appears you are implying that there is some direct relationship between the metric on the initial input space and the operation of the learning algorithm. I do not see how this is the case. > The main reason we are developing the ETS model is precisely related to > the fact that we believe it offers THE ONLY ONE POSSIBLE NATURAL (and > fundamentally new) SYMBIOSIS of the discrete and the continuous FORMALISMS > as opposed to the unnatural ones. I would definitely say (and you would > probably agree) that (if, indeed, this is the case) it is the most > important consideration.
> > Moreover, it turns out that the concept of a fuzzy set, which was > originally introduced in a rather artificial manner that didn't clarify > the underlying source of fuzziness (and this has caused an understandable > and substantial resistance to its introduction), emerges VERY naturally > within the ETS model: the definition of the class via the corresponding > distance function typically and naturally induces the fuzzy class boundary > and also reveals the source of fuzziness, which includes the interplay > between the corresponding weighted operations and (in the case of noise in > the training set) a nonzero radius. Note that in the parity class problem, > the parity class is not fuzzy, as reflected in the corresponding weighting > scheme and the radius of 0. Well, what one mathematician calls natural and the other calls artificial may be somewhat subject to taste as well as rational argument. At this point one can get into the realm of mathematical aesthetics or philosophy rather than hard science. From Tony.Plate at MCS.VUW.AC.NZ Sun Aug 16 04:30:59 1998 From: Tony.Plate at MCS.VUW.AC.NZ (Tony Plate) Date: Sun, 16 Aug 1998 20:30:59 +1200 Subject: Connectionist symbol processing: any progress? In-Reply-To: Your message of "Tue, 11 Aug 1998 03:34:27 -0400." <23040.902820867@skinner.boltz.cs.cmu.edu> Message-ID: <199808160830.UAA08508@rialto.mcs.vuw.ac.nz> Work has been progressing on higher-level connectionist processing, but progress has not been blindingly fast. As others have noted, it is a difficult area. One of the things that has recently renewed my interest in the idea of using distributed representations for processing complex information was finding out about Latent Semantic Analysis/Indexing (LSA/LSI) at NIPS*97. LSA is a method for taking a large corpus of text and constructing vector representations for words in such a way that similar words are represented by similar vectors. LSA works by representing a word by its context (harkening back to a comment I recently saw attributed to Firth 1957: "You shall know a word by the company it keeps" :-), and then reducing the dimensionality of the context using singular value decomposition (SVD) (v. closely related to principal component analysis (PCA)). The vectors constructed by LSA can be of any size, but it seems that moderately high dimensions work best: 100 to 300 elements. It turns out that one can do all sorts of surprising things with these vectors. One can construct vectors which represent documents and queries by merely summing the vectors for their words and do information retrieval, automatically getting around the problem of synonyms (since synonyms tend to have similar vectors). One can do the same thing with questions and multiple choice answers and pass exams (e.g., first year psychology exams, TOEFL tests). And all this just treating texts as unordered bags of words. While these results are intriguing, they don't achieve the goal of complex connectionist reasoning. However, they could provide an excellent source of representations for use in a more complex connectionist system (using connectionist in a very broad sense here). LSA is fast enough that it can be used on 10s of thousands of documents to derive vectors for thousands of words. This is exciting because it could allow one to start building connectionist systems which deal with full-range vocabularies and large varied task sets (as in info.
retrieval and related tasks), and which do more interesting processing than just forming the bag-of-words content of a document a la vanilla-LSA. As Ross Gayler mentioned, analogy processing is a very promising area for application of connectionist ideas. There are a few reasons for this being interesting: people do it all the time, structural relationships are important to the task, no explicit variables need be involved, and rule-based reasoning can be seen as a very specialized version of the task. One very interesting model of analogical processing that was presented at the workshop in Bulgaria (in July) was John Hummel and Kieth Holyoak's LISA model (ref at end). This model uses distributed representations for roles and fillers, binding them together with temporal synchrony, and achieves quite impressive results (John, in case you're listening, this is not to say that I think temporal binding is the right way to go, but it's an impressive model and presents a good challenge to other approaches.) I have to disagree with two of the technical comments made in this discussion: Jerry Feldman wrote: "Parallel Distributed Processing is a contradiction in terms. To the extent that representing a concept involves all of the units in a system, only one concept can be active at a time." One can easily represent more than one concept at a time in distributed representations. One of their beauties is the soft limit on the number of concepts that can be represented at once. This limit depends on the dimensionality of the system, the redundancy in representations, the similarity structure of the concepts, and so forth. All of the units in the system might be involved in representing a concept, but redundancy makes none essential. And of course one can also have different modules within a system. But, my point is that even within a single PDP module, one can still represent (and process) multiple concepts at once. Mitsu Hadeishi wrote: "an input space which consists of a fixed number of dimensions cannot handle recursive combinations" A number of people, including myself, have shown that it is possible to represent arbitrarily nested concepts in space with a fixed number of dimensions. Furthermore, the resulting representations have interesting and useful properties not shared by their symbolic counterparts. Very briefly, the way one can do this is by using vector-space operations for addition and multiplication to implement the conceptual operations of forming collections and binding concepts, respectively. For example, one can build a distributed representation for a shape configuration#33 of "circle above triangle" as: config33 = vertical + circle + triangle + ontop*circle + below*triangle By using an appropriate multiplication operation (I used circular, or wrapped, convolution), the reduced representation of the compositional concept (e.g., config33) has the same dimension as its components, and can readily be used as a component in other higher-level relations. Quite a few people have devised schemes for this type of representation, e.g., Paul Smolensky (Tensor Products), Jordan Pollack (RAAMs), Allesandro Sperduti (LRAAMs), Pentti Kanerva (Binary Spatter Codes). Another related scheme that uses distributed representations and tensor product bindings (but not role-filler bindings) is Halford, Wilson and Philips STAR model. 
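For readers who have not seen role/filler binding by circular convolution in action, here is a minimal NumPy sketch (a toy reconstruction of the idea, not Tony Plate's actual code; the vector names follow the config33 example above) showing that the bound structure keeps the dimensionality of its components and that a role's filler can be recovered by unbinding and comparing dot products.

import numpy as np

n = 1024
rng = np.random.default_rng(1)
rand_vec = lambda: rng.normal(0.0, 1.0 / np.sqrt(n), n)   # expected norm ~ 1

def cconv(a, b):
    # circular (wrapped) convolution, used here as the binding operation
    return np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)).real

def involution(a):
    # approximate inverse used for unbinding
    return np.roll(a[::-1], 1)

vertical, circle, triangle = rand_vec(), rand_vec(), rand_vec()
ontop, below = rand_vec(), rand_vec()

# "circle above triangle", all within one fixed-width vector:
config33 = vertical + circle + triangle + cconv(ontop, circle) + cconv(below, triangle)

# Query: what fills the "ontop" role?  Unbind, then compare by dot product.
probe = cconv(involution(ontop), config33)
for name, v in [("circle", circle), ("triangle", triangle), ("vertical", vertical)]:
    print(name, round(float(probe @ v), 2))
# "circle" should score clearly highest, even though config33 has the same
# dimensionality as each of its components.

The FFT-based convolution keeps each binding at O(n log n), which is part of why fairly high dimensionalities (hundreds to a thousand elements) remain cheap to work with.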
Some of the useful properties that of these types of distributed representations are as follows: (a) The reduced, distributed representation (e.g., config33) functions like a pointer, but is more that a mere pointer in that information about it contents is available directly without having to "follow" the "pointer." This makes it possible to do some types processing without having to unpack the structures. (b) The vector-space similarity of representations (i.e., the dot-product) reflects both superficial and structural similarity of structures. (c) There are fast, approximate, vector-space techniques for doing "structural" computations like finding corresponding objects in two analogies, or doing structural transformations. Some references: (Lots of LSA-related papers at: http://lsa.colorado.edu/ http://superbook.bellcore.com/~std/LSI.papers.html ) @article{deerwester-dumais-landauer-furnas-harshman-90, author = "S. Deerwester and S. T. Dumais and T. K. Landauer and G. W. Furnas and R. A. Harshman", year = "1990", title = "Indexing by latent semantic analysis", journal = "Journal of the Society for Information Science", volume = "41", number = "6", pages = "391-407", annote = "first technical LSI paper; good background." } @inproceedings{landauer-laham-foltz-98, author = "T. K. Landauer and D. Laham and P. W. Foltz", title = "Learning Human-like Knowledge with Singular Value Decomposition: A Progress Report", booktitle = "Neural Information Processing Systems (NIPS*97)", year = "1998" } @article{landauer-dumais-97, author = "T. K. Landauer and S. T. Dumais", year = "1997", title = "Solution to Plato's Problem: The Latent Semantic Analysis Theory of Acquisition, Induction and Representation of Knowledge", journal = "Psychological Review", pages = "211-240", volume = "104", number = "2" } @inproceedings{bartell-cottrell-belew-92, author = "B.T. Bartell and G.W. Cottrell and R.K. Belew", year = "1992", title = "{Latent Semantic Indexing} is an optimal special case of multidimensional scaling", booktitle = "Proc SIGIR-92", publisher = "ACM Press", address = "New York" } @article{hummel-holyoak-97, author = "J. E. Hummel and K. J. Holyoak", title = "Distributed representations of structure: {A} theory of analogical access and mapping", journal = "Psychological Review", year = 1997, volume = 104, number = 3, pages = "427--466", annote = "LISA paper" } @inproceedings{kanerva-96, author = "P. Kanerva", year = 1996, title = "Binary spatter-coding of ordered K-tuples", volume = 1112, pages = "869-873", publisher = "Springer", editor = "C. von der Malsburg and W. von Seelen and J.C. Vorbruggen and B. Sendhoff", booktitle = "Artificial Neural Networks--ICANN Proceedings", series = "Lecture Notes in Computer Science", address = "Berlin", keywords = "HRRs, distributed representations" } @unpublished{halford-wilson-phillips-bbs98, author = "Halford, Graeme and Wilson, William H. and Phillips, Steven", title = "Processing Capacity Defined by Relational Complexity: Implications for Comparative, Developmental, and Cognitive Psychology", note = "Behavioral and Brain Sciences", year = "to appear" } @InBook{plate-97c, author = "Tony A. 
Plate", chapter = "A Common Framework for Distributed Representation Schemes for Compositional Structure", title = "Connectionist Systems for Knowledge Representation and Deduction", publisher = "Queensland University of Technology", year = "1997", editor = "Fr\'ed\'eric Maire and Ross Hayward and Joachim Diederich", pages = "15-34" } @incollection{plate-98, author = "Tony Plate", title = "Analogy retrieval and processing with distributed represenations", year = "1998", booktitle = "Advances in Analogy Research: Integration of Theory and Data from the Cognitive, Computational, and Neural Sciences", pages = "154--163", editor = "Keith Holyoak and Dedre Gentner and Boicho Kokinov", publisher = "NBU Series in Cognitive Science, New Bugarian University, Sofia." } Tony Plate, Computer Science Voice: +64-4-495-5233 ext 8578 School of Mathematical and Computing Sciences Fax: +64-4-495-5232 Victoria University, PO Box 600, Wellington, New Zealand tap at mcs.vuw.ac.nz http://www.mcs.vuw.ac.nz/~tap From bryan at cog-tech.com Sun Aug 16 10:18:49 1998 From: bryan at cog-tech.com (Bryan B. Thompson) Date: Sun, 16 Aug 1998 10:18:49 -0400 Subject: Connectionist symbol processing: any progress? Message-ID: <199808161418.KAA30062@cti2.cog-tech.com> Lev wrote: > Can I give you a one sentence answer? If you look very carefully at > the topologies induced on the set of strings (over an alphabet of > size > 1) by various symbolic distances (of type given in the parity > class problem), then you will discover that they have hardly > anything to do with the continuous topologies we are used to from > the classical mathematics. In this sense, the difficulties ANNs have > with the parity problem are only the tip of the iceberg. Mitsu wrote: > I see no reason why what you are talking about would prevent a > connectionist approach (based on a recurrent or more sophisticated > architecture) from being able to discover the same symbolic > metric---because, as I say, the input space is not in any meaningful > sense a vector space, and the recurrent architecture allows the > "metric" of the learning algorithm, it seems to me, to acquire > precisely the kind of structure that you need it to---or, at least, > I do not see in principle why it cannot. The reason this is so is > again because the input is spread out over multiple presentations to > the network. > There are good reasons to use connectionist schemes, however, I > believe, as opposed to purely symbolic schemes. For one: symbolic > techniques are inevitably limited to highly discrete > representations, whereas connectionist architectures can at least in > theory combine both discrete and continuous representations. "Connectionist" is too broad a term to distinguish inherently symbolic from approaches which are not inherently symbolic, but which have yet to be clearly excluded from being able to induce approximately symbolic processing solutions. In an attempt to characterize these two approaches, the one builds in symbolic processing structure (this is certainly true for Shruti and, from reading Lev's messages, appears to be true of that research as well), while the other intends to utilize a "recurrent or more sophisticated architecture" to induce the desired behavior without "special" mechanisms. It is certainly true that we have the ability to, and, of necessity, must, construct connectionist systems with different inductive biases. 
A recurrent MLP (multi-layer-perceptron) *typically* builds in scalar weights, sigmoid transfer functions, high-forward connectivity, recurrent connections, etc. Simultaneous recurrent networks are similar, but build in a settling process by which an output/behavior is computed. In the work with Shruti, we have built into a simultaneous recurrent network localized structure and transfer functions which facilitate "symbolic" processing. While such specialized structure does not preclude using, e.g., backpropagation for learning, it also opens up explicit search of the structure space by methods more similar to evolutionary programming. My point, here, is not that we have the "right" solution, but that the architectural variations which are being discussed need not be exclusive. Given a focus on "symbolic" processing, I suggest that there are two issues which have dominated this discussion: - What inductive biases should be built into connectionist architectures for this class of problems? This question should include choices of "structure", and "learning rules". - What meaningful differences exist in the learned behavior of systems with different inductive biases. Especially, questions of rigidity and generalization of the solutions, the efficiency of learning, and the preservation of plasticity seem important. I feel that Lev is concerned that learning algorithms using recurrent networks with distributed representations have an inductive bias which limits their practical capacity to induce solutions (internal representations / transforms) for domains in which symbol processing is a critical. I agree with this "intuitively," but I would like to see a firmer characterization of why such networks are ill-suited for "symbolic processing" (quoted to indicate that good solutions need not be purely symbolic and could exhibit aspects of more classical ANNs). I am thinking about an effort several years ago which was made to characterize problems (and representations) which were "GA" hard -- this is, which were ill suited to the transforms and inductive biases of (certain classes of) genetic algorithms. A similar effort with various classes of connectionist architectures would be quite useful in moving beyond such "intuitive" senses of the fitness of different approaches and the expected utility of research in different connectionist solutions for different classes of problems. I feel that it is a reasonable argument that evolution has facilitated us with both gross and localized structure. That includes the body and the brain. Within the "brain" there are clearly systems that are structurally (pre-)disposed for different kinds of computing, witness the cerebellum vs the cerebral cortex. We do not need to, and should not, make the same structural choices for connectionist solutions for different classes of problems. My own intuitive "argument" leads me to believe that distributed connectionist solutions are unlikely to prove suitable for symbolic processing. Recurrent, and simultaneous recurrent, distributed networks may posses the representational capacity, but I maintain doubts concerning their inductive capacity for "symbolic" domains. 
Perhaps a fruitful approach would be to enumerate characteristics of a system which facilitate learning and behavior in domains which are considered "symbolic" (including variable binding, appropriate generalization, plasticity, etc.), and to see how those properties might be realized or approximated within the temporal dynamics of a class of distributed recurrent networks. This effort must, of course, not seek to allocate too much responsibility to single system and therefore, needs be part of a broader theory of the structure of mind and organism. If we consider that the primary mechanism of recurrence in a distributed representations as enfolding space into time, I still have reservations about the complexity that the agent / organism faces in learning an enfolding of mechanisms sufficient to support symbolic processing. --bryan thompson PS: I will be on vacation next week (Aug 17-21) and will be unable to answer any replies until I return. From rsun at research.nj.nec.com Sun Aug 16 19:24:01 1998 From: rsun at research.nj.nec.com (Ron Sun) Date: Sun, 16 Aug 1998 19:24:01 -0400 Subject: Connectionist symbol processing: any progress? Message-ID: <199808162324.TAA07433@pc-rsun.nj.nec.com> Along the line of Tony's summary of work on distributed connectionist models, here is my (possibly biased) summary of the state of the art of localist connectionist symbolic processing work. There have been a variety of work in developing LOCALIST connectionist models for symbolic processing, as pointed out by postings of Jerry Feldman and Shastri. The work spans a large spectrum of application areas in AI and cognitive science. Although it has been discussed somewhat, a more detailed list of work in this area include: ------------------------ REASONING (commonsense reasoning, logic reasoning, case-based reasoning, reasoning based on schemas/frames) L. Shastri and V. Ajjanagadde (1993). From goldfarb at unb.ca Sun Aug 16 21:20:59 1998 From: goldfarb at unb.ca (Lev Goldfarb) Date: Sun, 16 Aug 1998 22:20:59 -0300 (ADT) Subject: Connectionist symbol processing: any progress? In-Reply-To: <35D65650.3B387A0D@ministryofthought.com> Message-ID: On Sat, 15 Aug 1998, Mitsu Hadeishi wrote: > Lev Goldfarb wrote: > > > On Sat, 15 Aug 1998, Mitsu Hadeishi wrote: > > > > > Since you are using terms like "metric" extremely loosely, I was also doing > > > so. > > > > Please, note that although I'm not that precise, I have not used the > > "terms like 'metric' extremely loosely". > > I am referring to this statement: > > >How could a recurrent net learn without some metric and, as > >far as I know, some metric equivalent to the Euclidean metric?Here you are talking > about the input space as though the Euclidean metric on that space is particularly > key, when it is rather the structure of the whole network, the feedback scheme, the > definition of the error measure, the learning algortihm, and so forth which actually > create the relevant and important mathematical structure. Mitsu, I'm afraid, I failed to see what is wrong with my (quoted) question. First, I suggested in it that to do inductive learning properly ONE MUST HAVE AN EXPLICIT AND MEANINGFUL DISTANCE FUNCTION ON THE INPUT SPACE. And, second, given the latter plus the "foundations of the connectionism" (e.g. 
Michael Jordan's chapter 9, in the PDP, vol.1), if, indeed, one wants to use the n-tuple of real numbers as the input representation, then it is very natural to assume (at least for a mathematician) that the input space is a vector space, with the resulting necessity of an essentially unique metric on it (if the metric is consistent with the underlying vector space structure, which is practically a universal assumption in mathematics, see [2] in my first posting). > In a sufficiently complex > network, you can pretty much get any arbitrary map you like from the input space to > the output, and the error measure is biased by the specific nature of the training > set (for example), and is measured on the output of the network AFTER it has gone > through what amounts to an arbitrary differentiable transformation. By this time, > the "metric" on the original input space can be all but destroyed. Add recurrency > and you even get rid of the fixed dimensionality of the input space. In the quote > above, it appears you are implying that there is some direct relationship between > the metric on the initial input space and the operation of the learning algorithm. > I do not see how this is the case. YES, INDEED, I AM STRONGLY SUGGESTING THAT THERE MUST BE A DIRECT CONNECTION "BETWEEN THE METRIC ON THE INITIAL INPUT SPACE AND THE OPERATIONS OF THE LEARNING ALGORITHM". IN OTHER WORDS, THE SET OF CURRENT OPERATIONS ON THE REPRESENTATION SPACE (WHICH, OF COURSE, CAN NOW BE DYNAMICALLY MODIFIED DURING LEARNING) SHOULD ALWAYS BE USED FOR DISTANCE COMPUTATION. What is the point of, first, changing the symbolic representation to the numeric representation, and, then, applying to this numeric representation "very strange", symbolic, operations? I absolutely fail to see the need for such an artificial contortion. > > The main reason we are developing the ETS model is precisely related to > > the fact that we believe it offers THE ONLY ONE POSSIBLE NATURAL (and > > fundamentally new) SYMBIOSIS of the discrete and the continuous FORMALISMS > > as opposed to the unnatural ones. I would definitely say (and you would > > probably agree) that (if, indeed, this is the case) it is the most > > important consideration. > > > > Moreover, it turns out that the concept of a fuzzy set, which was > > originally introduced in a rather artificial manner that didn't clarify > > the underlying source of fuzziness (and this have caused an understandable > > and substantial resistance to its introduction), emerges VERY naturally > > within the ETS model: the definition of the class via the corresponding > > distance function typically and naturally induces the fuzzy class boundary > > and also reveals the source of fuzziness, which includes the interplay > > between the corresponding weighted operations and (in the case of noise in > > the training set) a nonzero radius. Note that in the parity class problem, > > the parity class is not fuzzy, as reflected in the corresponding weighting > > scheme and the radius of 0. > > Well, what one mathematician calls natural and the other calls artificial may be > somewhat subject to taste as well as rational argument. At this point one can get > into the realm of mathematical aesthetics or philosophy rather than hard science. 
> >From my point of view, symbolic representations can be seen as merely emergent > phenomena or patterns of behavior of physical feedback systems (i.e., looking at > cognition as essentially a bounded feedback system---bounded under normal > conditions, unless the system goes into seizure (explodes mathematically---well, it > is still bounded but it tries to explode!), of course.) From this point of view > both symbols and fuzziness and every other conceptual representation are neither > "true" nor "real" but simply patterns which tend to be, from an > information-theoretic point of view, compact and useful or efficient > representations. But they are built on a physical substrate of a feedback system, > not vice-versa. > > However, it isn't the symbol, fuzzy or not, which is ultimately general, it is the > feedback system, which is ultimately a physical system of course. So, while we may > be convinced that your formalism is very good, this does not mean it is more > fundamentally powerful than a simulation approach. It may be that your formalism is > in fact better for handling symbolic problems, or even problems which require a > mixture of fuzzy and discrete logic, etc., but what about problems which are not > symbolic at all? What about problems which are both symbolic and non-symbolic (not > just fuzzy, but simply not symbolic in any straightforward way?) > > The fact is, intuitively it seems to me that some connectionist approach is bound to > be more general than a more special-purpose approach. This does not necessarily > mean it will be as good or fast or easy to use as a specialized approach, such as > yours. But it is not at all convincing to me that just because the input space to a > connectionist network looks like R(n) in some superficial way, this would imply that > somehow a connectionist model would be incapable of doing symbolic processing, or > even using your model per se. The last paragraphs betray your classical physical bias based on our present (incidentally vector-space based) mathematics. As you can see from my home page, I do not believe in it any more: we believe that the (inductive) symbolic representation is a more basic and much more adequate (evolved during the evolution) form of representation, while the numeric form is a very special case of the latter when the alphabet consists of a single letter. By the way, I'm not the only one to doubt the adequacy of the classical form of representation. For example, here are two quotes from Erwin Schrodinger's book "Science and Humanism" (Cambridge Univ. Press), a substantial part of which is devoted to a popular explication of the following ideas: "The observed facts (about particles and light and all sorts of radiation and their mutual interaction) appear to be REPUGNANT to the classical ideal of continuous description in space and time." "If you envisage the development of physics in THE LAST HALF-CENTURY, you get the impression that the discontinuous aspect of nature has been forced upon us VERY MUCH AGAINST OUR WILL. We seemed to feel quite happy with the continuum. Max Plank was seriously frightened by the idea of a discontinuous exchange of energy . . ." (italics are in the original) Cheers, Lev From mitsu at ministryofthought.com Sun Aug 16 22:03:32 1998 From: mitsu at ministryofthought.com (Mitsu Hadeishi) Date: Sun, 16 Aug 1998 19:03:32 -0700 Subject: Connectionist symbol processing: any progress? 
References: Message-ID: <35D78F73.FC2C6E08@ministryofthought.com> Lev Goldfarb wrote: > Mitsu, I'm afraid, I failed to see what is wrong with my (quoted) > question. First, I suggested in it that to do inductive learning properly > ONE MUST HAVE AN EXPLICIT AND MEANINGFUL DISTANCE FUNCTION ON THE INPUT > SPACE. The point I am making is simply that after one has transformed the input space, two points which begin "close together" (not infinitesimally close, but just close) may end up far apart and vice versa. The mapping can be degenerate, singular, etc. Why is the metric on the initial space, then, so important, after all these transformations? Distance measured in the input space may have very little correlation with distance in the output space. Also, again, you continue to fail to address the fact that the input may be presented in time sequence (i.e., a series of n-tuples). What about that? In fact the structure of the whole thing may end up looking very much like your symbolic model. > > In a sufficiently complex > > network, you can pretty much get any arbitrary map you like from the input space to > > the output, and the error measure is biased by the specific nature of the training > > set (for example), and is measured on the output of the network AFTER it has gone > > through what amounts to an arbitrary differentiable transformation. By this time, > > the "metric" on the original input space can be all but destroyed. Add recurrency > > and you even get rid of the fixed dimensionality of the input space. In the quote > > above, it appears you are implying that there is some direct relationship between > > the metric on the initial input space and the operation of the learning algorithm. > > I do not see how this is the case. > > YES, INDEED, I AM STRONGLY SUGGESTING THAT THERE MUST BE A DIRECT > CONNECTION "BETWEEN THE METRIC ON THE INITIAL INPUT SPACE AND THE > OPERATIONS OF THE LEARNING ALGORITHM". IN OTHER WORDS, THE SET OF CURRENT > OPERATIONS ON THE REPRESENTATION SPACE (WHICH, OF COURSE, CAN NOW BE > DYNAMICALLY MODIFIED DURING LEARNING) SHOULD ALWAYS BE USED FOR DISTANCE > COMPUTATION. > > What is the point of, first, changing the symbolic representation to the > numeric representation, and, then, applying to this numeric representation > "very strange", symbolic, operations? I absolutely fail to see the need > for such an artificial contortion. If your problem is purely symbolic you may be right, but what if it isn't? (Also: no need to shout.) > > Well, what one mathematician calls natural and the other calls artificial may be > > somewhat subject to taste as well as rational argument. At this point one can get > > into the realm of mathematical aesthetics or philosophy rather than hard science. > > >From my point of view, symbolic representations can be seen as merely emergent > > phenomena or patterns of behavior of physical feedback systems (i.e., looking at > > cognition as essentially a bounded feedback system---bounded under normal > > conditions, unless the system goes into seizure (explodes mathematically---well, it > > is still bounded but it tries to explode!), of course.) From this point of view > > both symbols and fuzziness and every other conceptual representation are neither > > "true" nor "real" but simply patterns which tend to be, from an > > information-theoretic point of view, compact and useful or efficient > > representations. But they are built on a physical substrate of a feedback system, > > not vice-versa. 
> > > > However, it isn't the symbol, fuzzy or not, which is ultimately general, it is the > > feedback system, which is ultimately a physical system of course. So, while we may > > be convinced that your formalism is very good, this does not mean it is more > > fundamentally powerful than a simulation approach. It may be that your formalism is > > in fact better for handling symbolic problems, or even problems which require a > > mixture of fuzzy and discrete logic, etc., but what about problems which are not > > symbolic at all? What about problems which are both symbolic and non-symbolic (not > > just fuzzy, but simply not symbolic in any straightforward way?) > > > > The fact is, intuitively it seems to me that some connectionist approach is bound to > > be more general than a more special-purpose approach. This does not necessarily > > mean it will be as good or fast or easy to use as a specialized approach, such as > > yours. But it is not at all convincing to me that just because the input space to a > > connectionist network looks like R(n) in some superficial way, this would imply that > > somehow a connectionist model would be incapable of doing symbolic processing, or > > even using your model per se. > > The last paragraphs betray your classical physical bias based on our > present (incidentally vector-space based) mathematics. As you can see from > my home page, I do not believe in it any more: we believe that the > (inductive) symbolic representation is a more basic and much more adequate > (evolved during the evolution) form of representation, while the numeric > form is a very special case of the latter when the alphabet consists of a > single letter. It is quite often possible to describe one representation in terms of another; symbolic in terms of numbers, and vice-versa. What does this prove? You can say numbers are an alphabet with only one letter; I can describe alphabets with numbers, too. The real question is, which representation is natural for any given problem. Obviously symbolic representations have value and are parsimonious for certain problem domains, or they wouldn't have evolved in nature. But to say your discovery, great as it might be, is the only "natural" representation seems rather strange. Clearly, mechanics can be described rather elegantly using numbers, and there are lots of beautiful symmetries and so forth using that description. I am willing to believe other descriptions may be better for other situations, but I do not believe that it is reasonable to say that one can be certain that any given representation is *clearly* more natural than another. It depends on the situation. Symbolic representations have evolved, but so have numeric representations. They have different applications, and you can transform between them. Is one fundamentally "better" than another? Maybe better for this or that problem, but I do not believe it is reasonable to say they are better in some absolute sense. I am a "representation agnostic." I certainly am not going to say that numeric representations are the "only" valid basis, or even that they are foundational (to me that would be incoherent). All representations I believe are kind of stable information points reached as a result of dynamic feedback; in other words, they survive because they have evolutionary value. Whether you call this or that representation "real" or "better" to me is a matter of application and parsimony. 
The ultimate test is seeing how simple a description of a model is in any given representation. If the description is complex and long, the representation is not efficient; if it is short, it is. However, for generality one might choose a less parsimonious representation so as to gain expressive power over a greater range of models. Whether your model is better than connectionist models I do not know, but I do not think it is necessary to think of it as some kind of absolute choice. May the best representation win, as it were (it is a matter of survival of the fittest representation). Mitsu > By the way, I'm not the only one to doubt the adequacy of the classical > form of representation. For example, here are two quotes from Erwin > Schrodinger's book "Science and Humanism" (Cambridge Univ. Press), a > substantial part of which is devoted to a popular explication of the > following ideas: > > "The observed facts (about particles and light and all sorts of radiation > and their mutual interaction) appear to be REPUGNANT to the classical > ideal of continuous description in space and time." > > "If you envisage the development of physics in THE LAST HALF-CENTURY, you > get the impression that the discontinuous aspect of nature has been forced > upon us VERY MUCH AGAINST OUR WILL. We seemed to feel quite happy with the > continuum. Max Planck was seriously frightened by the idea of a > discontinuous exchange of energy . . ." > > (italics are in the original) > > Cheers, > Lev From bryan at cog-tech.com Sun Aug 16 22:04:25 1998 From: bryan at cog-tech.com (Bryan B. Thompson) Date: Sun, 16 Aug 1998 22:04:25 -0400 Subject: Connectionist symbol processing: any progress? Message-ID: <199808170204.WAA32648@cti2.cog-tech.com> Tony Plate's response is interesting and I, for one, will have to give it some thought. I am not certain that > concepts, respectively. For example, one can build a distributed > representation for a shape configuration#33 of "circle above triangle" as: config33 = vertical + circle + triangle + ontop*circle + below*triangle > > By using an appropriate multiplication operation (I used circular, or wrapped, convolution), the reduced representation of the compositional concept (e.g., config33) has the same dimension as its components, and can readily be used as a component in other higher-level relations. > Quite is inherently different from a spatial approach and, hence, a localist approach itself. You need to have enough dimensionality to represent the key features as well as enough to multiply them out by the key relational features -- quite a few dimensions, even if some of that dimensionality is pushed off into numerical precision. It sounds suspiciously like a localist (i.e., locally spatial) encoding. Frankly, I imagine that even a temporal encoding must be localist if it is to show "symbolic processing" behavior. That is, the temporal encoding must be striated with patterned regions that are, themselves, interpretable elements -- compositionality in time vs. space. If I am willing to call both temporal encoding and spatial encoding schemes localist, then what would I consider "distributed?" To the extent that this is a meaningful distinction, I would have to say that "distributed" refers to the equi-presence of the encoding of an entity or compositional relation among all elements of the representation, e.g., equally present in all internal variables in a recurrent network.
This is perhaps the intent of people who point to "distributed" representations and say that they can only encode a single entity at a time. When such systems are forced to encode compositional representations, they are also forced to develop decidedly non-equal distributions of the information across the elements of the representation. That is, they *must* become localist in time or in space to encode things compositionally. If this line of conjecture is correct, then localist and distributed are simply the ordinate directions on an axis of representation that reflects the compositionality of information, and spatial / temporal are the ordinate directions of an orthogonal axis reflecting how information is encoded within a fixed set of resources. Clearly this sense of distributed vs. localist is directly tied to the connectivity of the network and the degree to which weights are global vs. local. Another "upside" of localism, however achieved, is that it results in structured credit assignment -- weight or dynamics changes exert only a localized influence on the network behavior and do not disturb unrelated dynamics. My challenge for spatial encoding schemes is that they seem profoundly challenged by metaphor. For example, "Life is like a garden." This saying, when considered, immediately enacts a deep correspondence, an *invariance*, between two different *sets* of systematic relations (each defined over a different set of entities). If relations are spatially encoded, then it is beyond me how such systematic correspondences can be enacted by the dynamic activation of a single new relation. As I consider the ways in which I relate to a garden, the metaphor opens up for me systematically parallel ways in which I may relate to life as well. For example, you sow seeds, tend them, and harvest nourishing rewards. The seeds become metaphorical, e.g., as new beginnings, and the parallel yields an interpretation in "life". (Other inferences which can be systematically drawn -- it takes a lot of "fertilizer" to grow anything :} and sometimes I can't tell which is the weed and which is the seedling.) If we allocate spatial encoding to systematic relations, then how can we apply those systematic relations to new semantics -- both "instantly" and without loss of the original interpretations? In fact, our understanding typically grows for both domains illuminated by the metaphor. For me, a temporal (vs. spatial) encoding does not help. I would expect a temporal encoding to have developed a topology, upon whose relative stasis the system is equally dependent to draw out meanings. It seems, to me, that another level of indirection may be required to map onto one another such previously distinct systematic relations. On the other hand, perhaps such inferences "by metaphor" are not as automatic as I might believe. In that case it becomes more plausible to see these as a metalevel in which systematic correspondences are established between bindings in the two realms of metaphor. Then, within those binding-legitimizing invariances, systematic relations from one domain may readily apply to the other and our directed attention, or wandering gaze, is used to draw out new inferences from within one domain or the other. --bryan thompson PS: I will be on vacation next week (Aug 17-21) and will be unable to answer any replies until I return. From bryan at cog-tech.com Sun Aug 16 22:27:46 1998 From: bryan at cog-tech.com (Bryan B.
Thompson) Date: Sun, 16 Aug 1998 22:27:46 -0400 Subject: Structured connectionist architectures In-Reply-To: <199808160830.UAA08508@rialto.mcs.vuw.ac.nz> (message from Tony Plate on Sun, 16 Aug 1998 20:30:59 +1200) Message-ID: <199808170227.WAA32737@cti2.cog-tech.com> It seems that there are quite a few people working on structured connectionist approaches to symbolic reasoning. I would like to know if anyone has put together an (annotated?) bibliography on such research. On 16Aug98, Tony Plate wrote (was Re: Connectionist symbol processing: any progress?): > One very interesting model of analogical processing that was > presented at the workshop in Bulgaria (in July) was John Hummel and > Keith Holyoak's LISA model (ref at end). This model uses > distributed representations for roles and fillers, binding them > together with temporal synchrony, and achieves quite impressive > results (John, in case you're listening, this is not to say that I > think temporal binding is the right way to go, but it's an > impressive model and presents a good challenge to other > approaches.) If none exists, I would be more than willing to compile one myself if people will contribute entries / pointers to their own work. --bryan thompson PS: I will be on vacation next week (Aug 17-21) and will be unable to answer any replies until I return. From Tony.Plate at MCS.VUW.AC.NZ Mon Aug 17 06:59:45 1998 From: Tony.Plate at MCS.VUW.AC.NZ (Tony Plate) Date: Mon, 17 Aug 1998 22:59:45 +1200 Subject: Connectionist symbol processing: any progress? In-Reply-To: Your message of "Sun, 16 Aug 1998 22:04:25 -0400." <199808170204.WAA32648@cti2.cog-tech.com> Message-ID: <199808171059.WAA31817@rialto.mcs.vuw.ac.nz> Bryan B. Thompson writes: > ... I am not certain that > > [snip description of my scheme ...] > >is inherently different from a spatial approach and, hence, a localist >approach itself. You need to have enough dimensionality to represent >the key features as well as enough to multiply them out by the key >relational features -- quite a few dimensions [snip ...] > >If I am willing to call both temporal encoding and spatial encoding >schemes localist, then what would I consider "distributed?" To the >extent that this is a meaningful distinction, I would have to say >that "distributed" refers to the equi-presence of the encoding of an >entity or compositional relation among all elements of the >representation, e.g., equally present in all internal variables in a >recurrent network. [snip ...] Actually, Holographic Reduced Representations (HRRs) are an "equi-present" code -- everything represented is represented over all of the units. Suppose you have a vector X which represents some structure. Then you can take just the first half of X, and it will also represent that structure, though it will be noisier. This same property is shared by Kanerva's binary spatter-code and may be shared by some of the codes Ross Gayler has been developing. The dimensionality required is high -- for HRRs it's in the hundreds to thousands of elements. But, HRRs have an interesting scaling property -- toy problems involving just a couple dozen relations might require a dimensionality of 1000, but the dimensionality doesn't need to increase much (to 2 or 4 thousand) to handle problems involving tens of thousands of relations. Yes, I agree fully that metaphor and analogy are intriguing examples of structural processing, and I believe it could be very fruitful to investigate connectionist processing for them.
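A minimal numerical sketch of the binding and unbinding operations behind HRRs may make the "equi-presence" point concrete. The dimensionality, the item memory, and the FFT-based implementation of circular convolution below are illustrative assumptions, not a description of Plate's own code:

import numpy as np

rng = np.random.default_rng(0)
n = 1024                                   # illustrative dimensionality

def rand_item(n):
    # HRR items: i.i.d. normal elements with variance 1/n
    return rng.normal(0.0, 1.0 / np.sqrt(n), n)

def cconv(a, b):
    # circular (wrapped) convolution: the binding operation
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def inv(a):
    # approximate inverse (involution): a*[i] = a[-i mod n]
    return np.concatenate(([a[0]], a[1:][::-1]))

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

items = {w: rand_item(n) for w in ["circle", "triangle", "vertical", "ontop", "below"]}

# config33 = vertical + circle + triangle + ontop*circle + below*triangle
config33 = (items["vertical"] + items["circle"] + items["triangle"]
            + cconv(items["ontop"], items["circle"])
            + cconv(items["below"], items["triangle"]))

def cleanup(trace):
    # compare a noisy trace against the item memory and return the best match
    return max(items, key=lambda w: cosine(trace, items[w]))

print(cleanup(cconv(config33, inv(items["ontop"]))))    # -> circle

# "equi-presence": zero out half of the trace; the surviving half still
# (more noisily) decodes to the correct filler
half = config33.copy(); half[n // 2:] = 0.0
print(cleanup(cconv(half, inv(items["ontop"]))))        # -> circle (usually)

Degrading the trace, or adding more relations, weakens the match gradually rather than abruptly, which is one way to read the scaling behavior described above.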
Tony Plate From FRYRL at f1groups.fsd.jhuapl.edu Mon Aug 17 09:45:07 1998 From: FRYRL at f1groups.fsd.jhuapl.edu (Fry, Robert L.) Date: Mon, 17 Aug 1998 09:45:07 -0400 Subject: FW: Connectionist symbol processing: any progress? Message-ID: On Sat, 15 Aug 1998, Mitsu Hadeishi wrote: > > Lev, > > Okay, so we agree on the following: > > Recurrent ANNs have the computational power required. The only > thing at issue is the learning algorithm. > > Lev Goldfarb responded: >"The ONLY thing at issue" IS the MAIN thing at issue, because the >simulation of the Turing machine is just a clever game, while an adequate >model of inductive learning should, among many other things, change our >understanding of what science is (see, for example, Alexander Bird, >Philosophy of Science, McGill-Queen's University Press, 1998). I have not seen this reference, but will certainly seek it. I certainly agree with it, and especially so in the context of connectionist symbolic processing. Consider a computational paradigm where a single- or multiple-neuron layer is viewed as an information channel. It is different, however, from classical Shannon channels in that the neuron transduces information (vs. transmission) from input to an internal representation which is in turn used to select an output code. In a conventional Shannon channel, a channel input code is selected and then inserted into a channel which will degrade this information relative to a receiver that seeks to observe it. That is, one can distinguish between a communications system that effects the transmission of information and a physical system that effects the transduction of information. The engineering objective (as stated by Shannon) was to maximize the entropy of the source and match this to the channel capacity. Alternatively, consider a neural computational paradigm where the computational objective is to maximize the information transduced and match this to the output entropy of the neuron. That is, transduction and transmission are complementary processes of information transfer. Information transfer from physical system to physical system requires both. It is interesting that Warren Weaver, who co-authored the second chapter in the classic 1949 book "The Mathematical Theory of Communication", recognized this distinction and even made the following statement: "The word communication will be used here in a very broad sense to include all procedures by which one mind may affect another." This is a very interesting choice of words. Why is such a perspective important? Does it provide an unambiguous way of defining learning, symbols, input space, output space, computational objective/metrics, and an inductive theory of neural computation? The neural net community is often at odds with itself regarding having common bases of definitions and interpretations for these terms. After all, regardless of the learning objective function or error criterion, biological and artificial neurons, through learning, must either modify what they measure, e.g., synaptic efficacies and possibly intradendritic delays, modify what signals they generate for a given input, e.g., variations in firing threshold, or a combination of these. From sgallant at kstream.com Mon Aug 17 12:29:38 1998 From: sgallant at kstream.com (Steve Gallant) Date: Mon, 17 Aug 1998 11:29:38 -0500 Subject: Connectionist symbol processing: any progress?
In-Reply-To: <199808160830.UAA08508@rialto.mcs.vuw.ac.nz> References: Message-ID: <3.0.5.32.19980817112938.008abca0@pluto.kstream.com> In addition to the work on Latent Semantic Indexing mentioned by Tony Plate, there is a body of work involving 'Context Vectors'. This approach was specifically motivated by an attempt to capture semantic information in fixed-length vectors, and is based upon work I did at Northeastern U. A good overview of LSI and Context vectors can be found in: Caid WR, Dumais ST and Gallant SI. Learned vector-space models for document retrieval. Information Processing and Management, Vol. 31, No. 3, pp. 419-429, 1995. and the original source was: Gallant, S. I. A Practical Approach for Representing Context And for Performing Word Sense Disambiguation Using Neural Networks. Neural Computation, Vol. 3, No. 3, 1991, 293-309. Over the last several years HNC Software has further developed and commercialized this approach, forming a division called Aptex. Steve Gallant At 08:30 PM 8/16/98 +1200, Tony Plate wrote: > >Work has been progressing on higher-level connectionist >processing, but progress has not been blindingly fast. As >others have noted, it is a difficult area. > >One of things that has recently renewed my interest in the >idea of using distributed representations for processing >complex information was finding out about Latent Semantic >Analysis/Indexing (LSA/LSI) at NIPS*97. LSA is a method >for taking a large corpus of text and constructing vector .. Steve Gallant Knowledge Stream Partners 148 State Street Boston, MA 02109 tel: 617/742-2500, x562 fax: 617/742-5820 email: sgallant at kstream.com From cdr at lobimo.rockefeller.edu Mon Aug 17 14:33:32 1998 From: cdr at lobimo.rockefeller.edu (George Reeke) Date: Mon, 17 Aug 1998 14:33:32 -0400 Subject: Connectionist symbol processing: any progress? In-Reply-To: Mitsu Hadeishi "Re: Connectionist symbol processing: any progress?" (Aug 16, 7:03pm) References: <35D78F73.FC2C6E08@ministryofthought.com> Message-ID: <980817143332.ZM11217@grane.rockefeller.edu> On Aug 16, 7:03pm, Mitsu Hadeishi wrote: > The point I am making is simply that after one has transformed the > input space, two points which begin "close together" (not > infinitesimally close, but just close) may end up far apart and vice > versa. The mapping can be degenerate, singular, etc. Why is the > metric on the initial space, then, so important, after all these > transformations? Distance measured in the input space may have very > little correlation with distance in the output space. I can't help stepping in with the following observation: The reason that distance in the input space is so important is that the input space is the real world. It is generally (not always, of course) useful for biological organisms to make similar responses to similar situations--this is what we call "generalization". For this reason, whatever kind of representation is used, it probably should not distort the real-world metric too much. It is perhaps too easy when thinking in terms of mathematical abstractions to forget what the purpose of all these transformations might be. Regards, George Reeke Laboratory of Biological Modelling The Rockefeller University 1230 York Avenue New York, NY 10021 phone: (212)-327-7627 email: reeke at lobimo.rockefeller.edu From sirosh at hnc.com Mon Aug 17 14:45:30 1998 From: sirosh at hnc.com (Sirosh, Joseph) Date: Mon, 17 Aug 1998 11:45:30 -0700 Subject: What have neural networks achieved? 
Message-ID: Michael, A significant commercial success of neural networks has been in credit card fraud detection. The Falcon credit card fraud detection package, developed by HNC Software Inc. of San Diego (http://www.hnc.com/), uses supervised neural networks, covers over 260 million credit cards worldwide, and generates several tens of millions in annual revenue. Attached is a corporate blurb that gives more info about HNC and some of its products. There's more info on the company web page. Sincerely, Joseph Sirosh Senior Staff Scientist Exploratory R&D Group HNC Software Inc. ================================= Headquartered in San Diego, California, HNC Software Inc. (NASDAQ: HNCS) is the leading vendor of computational intelligence software solutions for the financial, insurance, and retail markets, and U.S. Government customers. HNC Software and its subsidiaries - Risk Data Corporation, CompReview, Aptex, and Retek - use advanced technologies such as neural networks, context vector analysis, and expert rules to deliver powerful solutions for complex pattern recognition and predictive modeling problems. For the U.S. Government, HNC has developed systems for content based text retrieval, multimedia information retrieval, image understanding, and intelligent agents. For commercial markets, HNC is the leading supplier of credit card fraud detection systems, with 23 of the 25 largest U.S. financial institutions being HNC customers. HNC also develops a broad spectrum of additional products, including solutions for profitability analysis, bankruptcy prediction, worker's compensation claims management, retail information management, and database mining. Since its founding in 1986, HNC has grown along with its product offerings and, as of the end of fiscal year 1997, had over 700 employees and revenues of $113 million. ================================== > ---------- > From: Michael A. Arbib > Sent: Monday, August 17, 1998 11:28 AM > To: Sirosh, Joseph > Subject: What have neural networks achieved? > > > Recently, Stuart Russell addressed the following query to Fellows of the > AAAI: > > > > > This Saturday there will be a debate with John McCarthy, David Israel, > > > Stuart Dreyfus and myself on the topic of > > > "How is the quest for artificial intelligence progressing?" > > > This is widely publicized, likely to be partially televised, > > > and will be attended by a lot of journalists. > > > > > > For this, and for AAAI's future reference, I'd like to collect > > > convincing examples of progress, particularly examples that will > > > convince journalists and the general public. For now all I need > > > is a URL or other accessible pointer and a one or two sentence > > > description. (It does not *necessarily* have to be your own work!) > > > Pictures would be very helpful. > > > > This spurs me as I work on the 2nd edition of the Handbook of Brain > Theory > > and Neural Networks (due out in 2 years or so; MIT Press has just issued > a > > paperback of the first edition) to pose to you two related questions: > > > > a) What are the "big success stories" (i.e., of the kind the general > public > > could understand) for neural networks contributing to the understanding > of > > "real" brains, i.e., within the fields of cognitive science and > > neuroscience. 
> > > > b) What are the "big success stories" (i.e., of the kind the general > public > > could understand) for neural networks contributing to the construction > of > > "artificial" brains, i.e., successfully fielded applications of NN > hardware > > and software that have had a major commercial or other impact? > > > > > > > > ********************************* > > Michael A. Arbib > > USC Brain Project > > University of Southern California > > Los Angeles, CA 90089-2520, USA > > arbib at pollux.usc.edu > > (213) 740-9220; Fax: 213-740-5687 > > http://www-hbp.usc.edu/HBP/ > > > > > > > -- > From goldfarb at unb.ca Mon Aug 17 17:09:01 1998 From: goldfarb at unb.ca (Lev Goldfarb) Date: Mon, 17 Aug 1998 18:09:01 -0300 (ADT) Subject: Connectionist symbol processing: any progress? In-Reply-To: <199808160830.UAA08508@rialto.mcs.vuw.ac.nz> Message-ID: On Sun, 16 Aug 1998, Tony Plate wrote: > One of things that has recently renewed my interest in the > idea of using distributed representations for processing > complex information was finding out about Latent Semantic > Analysis/Indexing (LSA/LSI) at NIPS*97. LSA is a method > for taking a large corpus of text and constructing vector > representations for words in such a way that similar words > are represented by similar vectors. LSA works by > representing a word by its context (harkenning back to a > comment I recently saw attributed to Firth 1957: "You shall > know a word by the company it keeps" :-), and then reducing > the dimensionality of the context using singular value > decomposition (SVD) (v. closely related to principal > component analysis (PCA)). The vectors constructed by LSA > can be of any size, but it seems that moderately high > dimensions work best: 100 to 300 elements. In connection with the above, in my Ph.D. (published as "A new approach to pattern recognition", in Progress in Pattern recognition 2, ed. Kanal and Rosenfeld, North-Holland, 1985, pp. 241-402) I have proposed to replace the PATTERN RECOGNITION PROBLEM formulated in an input space with a distance measure defined on it by the corresponding problem in a UNIQUELY constructed pseudo-Euclidean vector space (through a uniquely constructed isometric, i.e. distance preserving, embedding of the training set, using, of course SVD of the distance matrix). The classical recognition techniques can then be generalized to the pseudo-Euclidean space and the PATTERN RECOGNITION PROBLEM can then be solved more efficiently than in a general distance space setting. The model is OK, IF YOU HAVE THE RIGHT DISTANCE MEASURE, i.e. if you have the distance measure that capture the CLASS representation and therefore provides a good separation of the class from its complement. However, in general, WHO WILL GIVE YOU THE "RIGHT" DISTANCE MEASURE? I now believe that the construction of the "right" distance measure is a more basic, INDUCTIVE LEARNING, PROBLEM. In a classical vector space setting, this problem is obscured because of the rigidity of the representation space (and, as I have mentioned earlier, because of the resulting uniqueness of the metric), which apparently has not raised any substantiated suspicions in non-cognitive sciences. I strongly believe that this is due to the fact that the classical measurement processes are based on the concept of number and therefore as long as we rely on such measurement processes we are back where we started from--vector space representation. 
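For readers who have not seen the construction Lev refers to, a minimal sketch of such an isometric embedding is given below: eigendecompose the doubly centered squared-distance matrix (classical scaling); negative eigenvalues, when present, supply the "pseudo" part of the pseudo-Euclidean space. The four-point distance matrix is an illustrative assumption:

import numpy as np

# Toy distance matrix over four training patterns (illustrative).  Pattern 0 is
# at distance 1 from the other three, which are mutually at distance 2 -- a
# metric configuration with no exact Euclidean embedding.
D = np.array([[0., 1., 1., 1.],
              [1., 0., 2., 2.],
              [1., 2., 0., 2.],
              [1., 2., 2., 0.]])

m = D.shape[0]
J = np.eye(m) - np.ones((m, m)) / m        # centering matrix
B = -0.5 * J @ (D ** 2) @ J                # "Gram" matrix of the embedding

w, V = np.linalg.eigh(B)                   # B is symmetric but indefinite here
X = V * np.sqrt(np.abs(w))                 # coordinates in the pseudo-Euclidean space
print(np.round(w, 3))                      # one eigenvalue comes out negative

def dist(i, j):
    # distance under the indefinite inner product with signature sign(w)
    return np.sqrt(max(np.sum(np.sign(w) * (X[i] - X[j]) ** 2), 0.0))

print(np.round([[dist(i, j) for j in range(m)] for i in range(m)], 3))  # reproduces D

The classical recognition machinery can then be carried over to these coordinates, but, as stressed above, everything still hinges on the distance measure one starts from.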
On Sun, 16 Aug 1998, Mitsu Hadeishi wrote: > It is quite often possible to describe one representation in terms of another; symbolic in > terms of numbers, and vice-versa. What does this prove? You can say numbers are an > alphabet with only one letter; I can describe alphabets with numbers, too. > > The real question is, which representation is natural for any given problem. Obviously > symbolic representations have value and are parsimonious for certain problem domains, or > they wouldn't have evolved in nature. But to say your discovery, great as it might be, is > the only "natural" representation seems rather strange. Clearly, mechanics can be > described rather elegantly using numbers, and there are lots of beautiful symmetries and > so forth using that description. I am willing to believe other descriptions may be better > for other situations, but I do not believe that it is reasonable to say that one can be > certain that any given representation is *clearly* more natural than another. It depends > on the situation. Symbolic representations have evolved, but so have numeric > representations. They have different applications, and you can transform between them. > Is one fundamentally "better" than another? Maybe better for this or that problem, but I > do not believe it is reasonable to say they are better in some absolute sense. > > I am a "representation agnostic." (Mitsu, my apologies for the paragraph in italics in my last message: I didn't intend to "shout".) Concluding my brief discussion of the "connectionist symbol processing", I would like to say that I'm not at all a "representation agnostic". Moreover, I believe that the above "agnostic" position is a defensive mechanism that the mind has developed in the face of the mess that has been created out of the representation issues during the last 40 years. During this time, with the full emergence of computers, on the one hand, the role of non-numeric representations has begun to increase (see, for example, "Forms of Representation", ed. Donald Peterson, Intellect Books, 1996) and, at the same time, partly due to the disproportionate and inappropriate influence of the computability theory (again, related to the former), the concept of representation became relativized, as Mitsu so succinctly and quite representatively articulated above and throughout the entire discussion. Computability theory (and, ironically, the entire logic) has not dealt with the representational issues, because, basically, it has ignored the nature of intelligent computational processes, and thus, for example, the central, I believe, issue of how to construct the inductive class representation has not been addressed within it. My purpose for participating in this brief discussion (spread over the several messages) has been to strongly urge both theoretical and applied cognitive scientists to take the representation issue much more seriously and treat it with all the respect one can muster, i.e. to assume that the input, or representation, space is all we have and all we will ever have, and, as the mathematical (not logical) tradition of the past several thousand years strongly suggests, the operations of the representation space "make" this space. All other operations not related to the original space operations become then essentially invisible. 
For us, this path leads (unfortunately, very slowly) to a considerably more "non-numeric" mathematics than has historically been the case so far, and, at the same time, it inevitably leads to the "symbolic", or inductive, measurement processes, in which the outcome of the measurement process is not a number but a structured entity which we call "struct". Such measurement processes appear to be far-reaching generalizations of the classical, or numeric, measurement processes. Best regards and cheers, Lev From max at currawong.bhs.mq.edu.au Mon Aug 17 17:42:16 1998 From: max at currawong.bhs.mq.edu.au (Max Coltheart) Date: Tue, 18 Aug 1998 07:42:16 +1000 (EST) Subject: connectionist symbol processing Message-ID: <199808172142.HAA20682@currawong.bhs.mq.edu.au> A non-text attachment was scrubbed... Name: not available Type: text Size: 1442 bytes Desc: not available Url : https://mailman.srv.cs.cmu.edu/mailman/private/connectionists/attachments/00000000/ba92a509/attachment-0001.ksh From zhuh at santafe.edu Mon Aug 17 19:33:13 1998 From: zhuh at santafe.edu (Huaiyu Zhu) Date: Mon, 17 Aug 1998 17:33:13 -0600 (MDT) Subject: Evaluating learning algorithms (Was Re: Connectionist symbol processing: any progress?) In-Reply-To: <35D65650.3B387A0D@ministryofthought.com> Message-ID: Dear Connectionists, First, I apologize for jumping into an already lengthy debate between Lev Goldfarb and Mitsu Hadeishi. I will try to be succinct. I hope you will find the following worth reading. To begin with, let me claim that it is not contradictory to say: 1. The most important issue IS performance of learning rules. 2. Some quantitative measurement (loosely called "metric") is needed. 3. There is no intrinsic Euclidean metric on the space of learning algorithms. 4. The geometry and topology of input space is generally irrelevant. Now let me explain why they must be true, and what kind of theory can be developed to satisfy all of them: The key observation is that learning algorithms act on the space of probability distributions. Unless we are doing rote learning, we cannot assume the data are generated by a deterministic mechanism. Therefore the proper measurement should be derived from some geometry of the space of probability distributions. Now it is clear that even for a discrete set X, the space of functions on X is still a linear space equipped with norms and so on, and the space of probability distributions on X is still a differentiable manifold with intrinsic geometric structures. In fact the function space case can always be regarded as a special case of probability space. The space of probability distributions is not a Euclidean space. However, the beautiful theory of Information Geometry developed by Amari and others shows that it behaves almost as if it has a squared distance. For example, there is a Pythagorean Theorem for cross-entropy that enables us to do things very similar to taking averages and projecting on a linear subspace in a Euclidean space. Information Geometry fully specifies the amount of information that is lost by various computations. (A learning algorithm tries to reduce data with minimum reduction of information.) However, our information is not solely contained in the data. Different conclusions can be drawn from the same data if different statistical assumptions are made. Such assumptions can be specified by a prior (a distribution over all the possible distributions). Technically this is called the complete class theorem.
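One concrete special case of the Pythagorean relation mentioned above can be checked in a few lines: take a joint distribution p over two binary variables, let q be its projection onto the family of independent distributions (the product of its marginals), and let r be any other independent distribution; then KL(p||r) = KL(p||q) + KL(q||r). The particular numbers below are illustrative assumptions:

import numpy as np

def kl(p, q):
    # Kullback-Leibler divergence between two distributions on the same support
    p, q = np.asarray(p, float).ravel(), np.asarray(q, float).ravel()
    return float(np.sum(p * np.log(p / q)))

# joint distribution p over two binary variables (illustrative numbers)
p = np.array([[0.30, 0.20],
              [0.10, 0.40]])

# projection of p onto the submanifold of independent distributions:
# the product of its marginals
q = np.outer(p.sum(axis=1), p.sum(axis=0))

# any other independent distribution r
r = np.outer([0.6, 0.4], [0.25, 0.75])

print(kl(p, r))               # equals ...
print(kl(p, q) + kl(q, r))    # ... this sum, up to floating point

KL divergence is neither symmetric nor the square of a true metric, which is exactly the sense in which this space only "behaves almost as if it has a squared distance"; Amari's book gives the general statement.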
The prior and the data should be consistently combined in a Bayesian method. It can be shown that learning algorithms can be completely quantitatively evaluated in this framework. The input space is generally irrelevant because usually we only want to learn the conditional distribution of output for given input. In the case of unsupervised learning, such as independent component analysis, it is still the space of distributions that is relevant. Again, information geometry has helped to make several breakthroughs in recent years. (Check for papers by Amari, Cardoso, Sejnowski and many others. See, e.g., http://www.cnl.salk.edu/~tewon/ica_cnl.html.) The following short paper (4 pages) contains an outline with a little bit more technical detail. The numerous articles by Prof Amari, and esp. his 1985 book, should prove extremely useful regarding information geometry. H. Zhu, Bayesian geometric theory of learning algorithms. Proc. of Intnl. Conf. Neural Networks (ICNN'97), Vol.2, pp.1041-1044. Houston, 9-12 June, 1997. ftp://ftp.santafe.edu/pub/zhuh/ig-learn-icnn.ps Hope you enjoyed reading to this point. Huaiyu -- Huaiyu Zhu Tel: 1 505 984 8800 ext 305 Santa Fe Institute Fax: 1 505 982 0565 1399 Hyde Park Road mailto:zhuh at santafe.edu Santa Fe, NM 87501 http://www.santafe.edu/~zhuh/ USA ftp://ftp.santafe.edu/pub/zhuh/ From Sougne at forum.fapse.ulg.ac.be Tue Aug 18 04:40:13 1998 From: Sougne at forum.fapse.ulg.ac.be (Jacques Sougne) Date: Tue, 18 Aug 1998 10:40:13 +0200 Subject: Connectionist symbol processing: any progress? In-Reply-To: <23040.902820867@skinner.boltz.cs.cmu.edu> Message-ID: Hi, I am working on modeling deductive reasoning using a distributed network of spiking nodes. Variable binding is achieved by temporal synchrony. While this is not a new technique, the use of distributed representations, and the way the model solves the problem of multiple instantiation, is new. I called the model INFERNET. I obtained good results with conditional reasoning, even with negated conditionals, in all these forms: A=>B, A; A=>B, ~A; A=>B, B; A=>B, ~B A=>~B, A; A=>~B, ~A; A=>~B, ~B; A=>~B, B ~A=>B, ~A; ~A=>B, A; ~A=>B, B; ~A=>B, ~B ~A=>~B, ~A; ~A=>~B, A; ~A=>~B, ~B; ~A=>~B, B The INFERNET performance fits human data, which are sensitive to negation. The effect of negation is often referred to as negative conclusion bias. I also worked on problems requiring multiple instantiations (see Sougne, 1998a; Sougne, 1998b). In INFERNET multiple instantiation is achieved by using the neurobiological phenomenon of period doubling. Nodes pertaining to a doubly instantiated concept will sustain two oscillations. This means that these nodes will be able to synchronize with two different sets of nodes. The INFERNET performance seems to fit human data for problems requiring multiple instantiation like: Mark loves Helen and Helen loves John. Who is jealous of whom? Due to the distributed representation I also found an effect of similarity of the concepts used in deductive tasks, which is confirmed by empirical evidence. I also found an interesting effect of noise. When white noise is added to the system (and if it is not too large) the performance of the system is improved. This phenomenon is known as stochastic resonance (see Levin & Miller 1996, Sougne, 1998b). Description of my work can be found in: Sougne, J. (1996). A Connectionist Model of Reflective Reasoning Using Temporal Properties of Node Firing. Proceedings of the Eighteenth Annual Conference of the Cognitive Science Society. (pp.
666-671) Mahwah, NJ: Lawrence Erlbaum Associates. Sougné, J. (1998). Connectionism and the problem of multiple instantiation. Trends in Cognitive Sciences, 2, 183-189. Sougné, J. (1998). Period Doubling as a Means of Representing Multiply Instantiated Entities. Proceedings of the Twentieth Annual Conference of the Cognitive Science Society. (pp. 1007-1012) Mahwah, NJ: Lawrence Erlbaum Associates. Sougné, J. and French, R. M. (1997). A Neurobiologically Inspired Model of Working Memory Based on Neuronal Synchrony and Rhythmicity. In J. A. Bullinaria, D. W. Glasspool, and G. Houghton (Eds.) Proceedings of the Fourth Neural Computation and Psychology Workshop: Connectionist Representations. London: Springer-Verlag. Some preliminary versions are available at: http://www.fapse.ulg.ac.be/Lab/Trav/jsougne.html Some of the recently collected data are not yet published; if you are interested, you can contact me. j.sougne at ulg.ac.be References Sougné, J. (1998a). Connectionism and the problem of multiple instantiation. Trends in Cognitive Sciences, 2, 183-189. Sougné, J. (1998b). Period Doubling as a Means of Representing Multiply Instantiated Entities. Proceedings of the Twentieth Annual Conference of the Cognitive Science Society. (pp. 1007-1012) Mahwah, NJ: Lawrence Erlbaum Associates. Levin, J. E. and Miller, J. P. (1996). Broadband neural encoding in the cricket cercal sensory system enhanced by stochastic resonance. Nature, 380, 165-168. Jacques From Rob.Callan at solent.ac.uk Tue Aug 18 07:08:48 1998 From: Rob.Callan at solent.ac.uk (Rob.Callan@solent.ac.uk) Date: Tue, 18 Aug 1998 12:08:48 +0100 Subject: Connectionist symbol processing: any progress? Message-ID: <80256664.003DC810.00@hercules.solent.ac.uk> Dear Bryan I was interested to read your response to Tony Plate's message: "Tony Plate's response is interesting and I, for one, will have to give it some thought. I am not certain that > concepts, respectively. For example, one can build a distributed > representation for a shape configuration#33 of "circle above triangle" as: config33 = vertical + circle + triangle + ontop*circle + below*triangle > > By using an appropriate multiplication operation (I used circular, or wrapped, convolution), the reduced representation of the compositional concept (e.g., config33) has the same dimension as its components, and can readily be used as a component in other higher-level relations. > Quite is inherently different from a spatial approach and, hence, a localist approach itself. You need to have enough dimensionality to represent the key features as well as enough to multiply them out by the key relational features -- quite a few dimensions, even if some of that dimensionality is pushed off into numerical precision..." I think this point "You need to have enough dimensionality to represent the key features" has often been overlooked. I am speaking in particular about RAAMs, of which I have the most experience. One of the great attractions of reduced representations is their potential to be used in holistic processes. However, it appears that the greater the 'reduction' the harder it is for holistic processing. Boden & Niklasson (1995) showed that for a set of tree structures encoded with a RAAM, the structure was maintained but the influence of constituents was not necessarily available for holistic processing.
About 3 years ago we developed (S)RAAM (simplified RAAM - see Callan & Palmer-Brown 1997) which uses PCA and a recursive procedure to produce matrices that simulate the first and second weight-layers of a RAAM. Unlike RAAMs, (S)RAAMs cannot reduce the 'representational width' beyond the redundancy present in the training set. One of my students (John Flackett) has repeated Boden and Niklasson's experiment with (S)RAAM and results (unsurprisingly) show a significant improvement over their RAAM. The action of the recursive process also appears to impose a weighting of the constituents but this is to be further explored. The weighting may prove useful for some tasks (e.g., planning) and so is not necessarily a bad thing for all forms of holistic processing. It is also clear to me that (S)RAAMs have no capability to exhibit 'strong systematicity' and I believe the same is true of RAAMs. I am not ruling out the possibility of strong systematic behavior when RAAMs, etc., are used in a modular system (some impressive results were demonstrated by Niklasson & Sharkey 1997). For the general reading list, two recent papers that offer some interesting discussion are: Steven Phillips - examines systematicity in feedforward and recurrent networks - ref below James Hammerton - general discussion and definition of holistic computation - ref below Callan R, Palmer-Brown D (1997). (S)RAAM: An Analytical Technique for Fast and Reliable Derivation of Connectionist Symbol Structure Representations. Connection Science, Vol 9, No 2. Boden M, Niklasson L (1995). Features of Distributed Representations for Tree-structures: A Study of RAAM. Presented at the 2nd Swedish Conference on Connectionism. Published in Current Trends in Connectionism (Niklasson & Boden eds.) Lawrence Erlbaum Associates. Niklasson L, Sharkey N E (1997) Systematicity and generalization in compositional connectionist representations, in G Dorffner (ed), Neural Networks and a New Artificial Intelligence. International Thomson Computer Press. Phillips S (1998). Are Feedforward and Recurrent Networks Systematic? Analysis and Implications for a Connectionist Cognitive Architecture. Connection Science, Vol 10, No 2. Hammerton J (1998) Holistic Computation: Reconstructing a Muddled Concept. Connection Science, Vol 10, No 1. From scheler at informatik.tu-muenchen.de Tue Aug 18 07:37:49 1998 From: scheler at informatik.tu-muenchen.de (Gabriele Scheler) Date: Tue, 18 Aug 1998 13:37:49 +0200 Subject: Connectionist symbol processing: any progress? Message-ID: <98Aug18.133754+0200_met_dst.7649-22196+91@papa.informatik.tu-muenchen.de> As the question of metrics in pattern recognition seems to be the subject of some controversy, here is my point of view: It is quite possible to construct algorithms for learning the metric of some set of patterns in the supervised learning paradigm. This means that if we have a set of patterns for class A and a set of patterns for class B we can induce a metric such that class A patterns are close (similar) to each other and dissimilar to class B patterns. This metric will usually be rather distinct from metrics based on numeric properties, such as the Euclidean metric. It is especially useful in the case of binary patterns, which code some features with several distinct ("symbolic") values. Such a metric can be rather involved; for instance, the metrics that determine phonemic similarity in different languages are quite different. (Think of "r" and "l" to speakers of Indo-European and some Asian languages).
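A minimal sketch of this kind of supervised metric induction on binary patterns follows; the particular weighting rule (per-feature weights taken from how differently the two classes use each feature) and the toy data are illustrative assumptions, not the algorithm of the papers cited below:

import numpy as np

# Two classes of binary feature patterns (illustrative data).  Features 0-1
# separate the classes; features 2-5 vary idiosyncratically within them.
A = np.array([[1, 1, 0, 0, 1, 0],
              [1, 1, 0, 1, 0, 0],
              [1, 1, 1, 0, 0, 1]])
B = np.array([[0, 0, 0, 0, 1, 1],
              [0, 0, 1, 1, 0, 0],
              [0, 0, 1, 0, 1, 0]])

# induce per-feature weights from how differently the classes use each feature
w = np.abs(A.mean(axis=0) - B.mean(axis=0))

def dist(x, y, w):
    return float(np.sum(w * np.abs(x - y)))       # weighted Hamming distance

def separation(w):
    within = [dist(x, y, w) for X in (A, B)
              for i, x in enumerate(X) for y in X[i + 1:]]
    between = [dist(x, y, w) for x in A for y in B]
    return max(within), min(between)

print("plain Hamming  (max within, min between):", separation(np.ones(6)))
print("induced metric (max within, min between):", separation(w))
# Under the induced metric every within-class distance is smaller than every
# between-class distance; the unweighted Hamming distance gives no such
# separation on these patterns.

The same move carries over to multi-valued symbolic features, which is where such an induced metric departs most clearly from a Euclidean metric on a numeric encoding.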
This approach has therefore been applied to the question of how phonemes (classes) relate to phonetic features (pattern sets). However, I believe, as has also been pointed out by Pao and possibly Kohonen, that the problems of learning distance metrics or of finding hypersurfaces (as in back-propagation) are in substance related. In the first case, we have a distorted topology (with respect to Euclidean space) but simple dividing lines (such as circles); in the second case the topology stays fixed (Euclidean space), but dividing surfaces may be rather complex. Although in certain cases we may prefer one method rather than the other, the question of symbolic vs. non-symbolic ("numeric"?) representations really has not much to do with it. Recall that back-propagation is one method of universal function approximation, and you find that any class distinction is approximable provided you have found a sufficient and suitable training set (which is, of course, the practical problem). Nonetheless, efforts to build pattern classification schemes based on an induction of different metrics for different problems are, I believe, really interesting and may change some ideas on what constitutes "easy" and "hard" problems. (For instance, I agree with Lev that classification according to parity, as well as several symmetry and similar problems, becomes very easy with distance-measure-based approaches.) Gabriele References: Pao, Y.H.: Adaptive Pattern Recognition and Neural Networks. Addison-Wesley, 1989. Kohonen, T.: Self-Organization and Associative Memory. Springer 1989. Scheler, G.: Feature Selection with Exception Handling - An Example from Phonology. In: Trappl, R. (ed.) Proceedings of EMCSR 94, Springer, 1994. Scheler, G.: Pattern Classification with Adaptive Distance Measures, Tech Rep FKI-188-94, 1994. (some more references at www7.informatik.tu-muenchen.de/~scheler/publications.html). From rsun at research.nj.nec.com Tue Aug 18 12:46:00 1998 From: rsun at research.nj.nec.com (Ron Sun) Date: Tue, 18 Aug 1998 12:46:00 -0400 Subject: CFP: Journal of Cognitive Systems Research Message-ID: <199808181646.MAA08213@pc-rsun.nj.nec.com> Subject: Call for Papers: new electronic Journal of Cognitive Systems Research CALL FOR PAPERS Journal of Cognitive Systems Research Editors-in-Chief Ron Sun E-mail: rsun at cs.ua.edu Department of Computer Science and Department of Psychology University of Alabama Tuscaloosa AL, USA Vasant Honavar E-mail: honavar at cs.iastate.edu Department of Computer Science Iowa State University USA Gregg Oden E-mail: gregg-oden at uiowa.edu Department of Psychology University of Iowa USA The Journal of Cognitive Systems Research covers all topics in the study of cognitive processes, in both natural and artificial systems: Knowledge Representation and Reasoning Learning Perception Action Memory Problem-Solving and Cognitive Skills Language and Communication Agents Integrative Studies of Cognitive Systems The journal emphasizes the integration/synthesis of ideas, concepts, constructs, theories, and techniques from multiple paradigms, perspectives, and disciplines, in the analysis, understanding and design of cognitive and intelligent systems. Contributions describing results obtained within the traditional disciplines are also sought if such work has broader implications and relevance. The journal seeks to foster and promote the discussion of novel approaches in studying cognitive and intelligent systems.
It also encourages cross-fertilization of disciplines, by publishing high-quality contributions in all of the areas of study, including artificial intelligence, linguistics, psychology, psychiatry, philosophy, system and control theory, anthropology, sociology, biological sciences, and neuroscience. The scope of the journal includes the study of a variety of different cognitive systems, at different levels, ranging from social/cultural cognition, to individual cognitive agents, to components of such systems. Of particular interest are theoretical, experimental, and integrative studies and computational modeling of cognitive systems, at different levels of detail, and from different perspectives. Please send submissions in POSTSCRIPT format by electronic mail to one of the three co-Editors-in-Chief. Note The journal transends traditional disciplinary boundaries, and considers contributions from all relevant disciplines and approaches. The key is the quality of the work and the accessibility and relevance to readers in different disciplines. The first issue of this new on-line journal, published by ElsevierScience, will appear in early 1999. In addition to this electronic journal, the issues will also be printed and bound as archival volume. Published papers will be considered automatically for inclusion in specially edited books on Cognitive Systems Research. For further information, see: http://cs.ua.edu/~rsun/journal.html -------------------- action editors: John Barnden, School of Computer Science, University of Birmingham, U.K.\\ William Bechtel, Department of Philosophy, Washington University, St. Louis, USA.\\ Rik Belew, Computer Science and Engineering Department, University of California, San Diego, USA.\\ Mark H. Bickhard, Department of Psychology, Lehigh University, USA.\\ Deric Bownds, Dept. of Zoology, University of Wisconsin, Madison, USA. \\ David Chalmers, Department of Philosophy, University of California, Santa Cruz, USA. \\ B. Chandrasekaran, Department of Computer and Information Science, Ohio State University, USA.\\ Marco Dorigo, University of Brussels, Brussels, Belgium\\ Michael Dyer, Computer Science Department, University of California, Los Angeles, USA.\\ Lee Giles, NEC Research Institute, Princeton, New Jersey, USA. \\ George Graham, Philosophy Department, University of Alabama at Birmingham, Birmingham, AL, USA.\\ Stephen J.Hanson, Psychology Dept., Rutgers University, Newark, New Jersey, USA.\\ Valerie Gray Hardcastle, Dept. of Philosophy, Virginia Polytechnic and State University, Blacksburg, Virginia, USA.\\ James Hendler, Department of Computer Science, University of Maryland, College Park, USA.\\ Stephen M. Kosslyn, Department of Psychology, Harvard University, USA. \\ George Lakoff, Dept. of Linguistics, University of California, Berkeley, USA.\\ Joseph LeDoux, Center for Neuroscience, New York University, New York, USA.\\ Daniel Levine, Department of Psychology, University of Texas at Arlington, USA.\\ Vladimir J. Lumelsky, Robotics Laboratory, Department of Mechanical Engineering, University of Wisconsin, Madison, USA.\\ James Pustejovsky, Brandeis University, Massachusetts, USA.\\ Lynne M. Reder, Department of Psychology, Carnegie Mellon University, Pittsburgh, PA 15213, USA.\\ Jude Shavlik, Computer Sciences Department, University of Wisconsin, Madison, USA.\\ Tim Shallice, Department of Psychology, University College, London, UK \\ Aaron Sloman, School of Computer Science, The University of Birmingham, UK. 
\\ Paul Thagard, Philosophy Department, University of Waterloo, Canada.\\ Leonard Uhr, Computer Sciences Department, University of Wisconsin, Madison, USA.\\ David Waltz, NEC Research Institute, Princeton, NJ, USA.\\ Xin Yao, Dept. of Computer Science, Australian Defense Force Academy, Canberra, Australia.\\ From dsilver at csd.uwo.ca Tue Aug 18 16:57:14 1998 From: dsilver at csd.uwo.ca (Danny L. Silver) Date: Tue, 18 Aug 1998 16:57:14 -0400 (EDT) Subject: Connectionist symbol processing & work by Baxter Message-ID: <199808182057.QAA05573@church.ai.csd.uwo.ca> Along the lines of Dr. Zhu, I would like to point out the important work of Jon Baxter concerning the "learning of internal representations", in which he develops a "canonical distortion measure" (CDM). The CDM is in fact a metric over the input space defined by the probability distribution over a domain of tasks (e.g. character recognition) each of which shares the input space. The metric can be used to measure the similarity of input vectors. The important aspect of Baxter's work is that he shows formally and demonstrates empirically that the CDM for a particular task (or environmental) domain can be LEARNED to the desired level of accuracy if the learner samples sufficiently from the domain of tasks, i.e., if the learner "experiences" the environment long enough and well enough. Once learned, this CDM metric can be used to facilitate learning any new task from the domain - thus it can be considered a process of "learning to learn". Baxter, in fact, demonstrates how the CDM metric can be learned within the hidden node representations of a neural network for a simple task domain. Based on this, I would conclude that the facility of symbolic representation and logical metrics is largely a function of the domain of tasks under consideration and not necessarily THE best representation or metric. For details please refer to: Jonathan Baxter. Learning Internal Representations. PhD Thesis, Dept. Mathematics and Statistics, The Flinders University of South Australia, 1995. Draft copy available in Neuroprose Archive - /pub/neuroprose/Thesis/baxter.thesis.ps.Z Jonathan Baxter. The Canonical Distortion Measure for Vector Quantization and Function Approximation. Learning to Learn, edited by Sebastian Thrun and Lorien Pratt, 1998, Kluwer Academic Publishers, p.159-179 Cheers .. Danny Silver -- ========================================================================= = Daniel L. Silver University of Western Ontario, London, Canada = = N6A 3K7 - Dept. of Comp. Sci. = = dsilver at csd.uwo.ca H: (902)582-7558 O: (902)494-1813 = = WWW home page .... http://www.csd.uwo.ca/~dsilver = ========================================================================= From giles at research.nj.nec.com Tue Aug 18 18:17:41 1998 From: giles at research.nj.nec.com (Lee Giles) Date: Tue, 18 Aug 1998 18:17:41 -0400 (EDT) Subject: What have neural networks achieved? In-Reply-To: from "Michael A. Arbib" at Aug 14, 98 02:07:20 pm Message-ID: <199808182217.SAA26188@alta.nj.nec.com> Michael Arbib wrote: > > b) What are the "big success stories" (i.e., of the kind the general public > could understand) for neural networks contributing to the construction of > "artificial" brains, i.e., successfully fielded applications of NN hardware > and software that have had a major commercial or other impact? > > > > ********************************* > Michael A.
Arbib > USC Brain Project > University of Southern California > Los Angeles, CA 90089-2520, USA > arbib at pollux.usc.edu > (213) 740-9220; Fax: 213-740-5687 > http://www-hbp.usc.edu/HBP/ > > For those of you who might have missed it, an entire issue of IEEE TNN was devoted to "practical uses of NNs." The issue was IEEE Transactions on Neural Networks Volume 8, Number 4 July 1997 Table of Contents plus page numbers are below: 827... Neural Fraud Detection in Credit Card Operations Jose R. Dorronsoro, Francisco Ginel, Carmen Sanchez, and Carlos Santa Cruz 835... ANNSTLF--A Neural-Network-Based Electric Load Forecasting System Alireza Khotanzad, Member, IEEE, Reza Afkhami-Rohani, Tsun-Liang Lu, Alireza Abaye, Member, IEEE, Malcolm Davis, and Dominic J. Maratukulam 847... A Deployed Engineering Design Retrieval System Using Neural Networks Scott D. G. Smith, Richard Escobedo, Michael Anderson, and Thomas P. Caudell 852... Modeling Complex Environmental Data Chris M. Roadknight, Graham R. Balls, Gina E. Mills, and Dominic Palmer-Brown 863... Neural Networks and Traditional Time Series Methods: A Synergistic Combination in State Economic Forecasts James V. Hansen and Ray D. Nelson 874... Reliable Roll Force Prediction in Cold Mill Using Multiple Neural Networks Sungzoon Cho, Member, IEEE, Yongjung Cho, and Sungchul Yoon 883... Dynamic Neural Control for a Plasma Etch Process Jill P. Card, Member, IEEE, Debbie L. Sniderman, and Casimir Klimasauskas 902... Application of Neural Networks to Software Quality Modeling of a Very Large Telecommunications System Taghi M. Khoshgoftaar, Member, IEEE, Edward B. Allen, Member, IEEE, John P. Hudepohl, and Stephen J. Aud 910... Neural Intelligent Control for a Steel Plant Gerard Bloch, Member, IEEE, Franck Sirou, Vincent Eustache, and Philippe Fatrez 919... Characterization of Aluminum Hydroxide Particles from the Bayer Process Using Neural Network and Bayesian Classifiers Anthony Zaknich, Member, IEEE 932... Fuzzy Neural Networks for Machine Maintenance in Mass Transit Railway System James N. K. Liu, Member, IEEE, and K. Y. Sin 942... Dynamic Security Contingency Screening and Ranking Using Neural Networks Yakout Mansour, Fellow, IEEE, Ebrahim Vaahedi, Senior Member, IEEE, and Mohammed A. El-Sharkawi, Fellow, IEEE 951... Self-Calibration of a Space Robot Vicente Ruiz de Angulo and Carme Torras 964... Cork Quality Classification System using a Unified Image Processing and Fuzzy-Neural Network Methodology Joongho Chang, Gunhee Han, Jose M. Valverde, Norman C. Griswold, Senior Member, IEEE, J. Francisco Duque-Carrillo, Member, IEEE, and Edgar Sanchez-Sinencio, Fellow, IEEE Those of you who are IEEE members and have access can download these papers from http://opera.ieee.org/opera/browse.html Best regards Lee Giles -- __ C. Lee Giles / Computer Science / NEC Research Institute / 4 Independence Way / Princeton, NJ 08540, USA / 609-951-2642 / Fax 2482 www.neci.nj.nec.com/homepages/giles == From Jon.Baxter at keating.anu.edu.au Tue Aug 18 18:57:52 1998 From: Jon.Baxter at keating.anu.edu.au (Jonathan Baxter) Date: Wed, 19 Aug 1998 08:57:52 +1000 (EST) Subject: Connectionist symbol processing: any progress? 
Message-ID: <199808182257.IAA22931@reid.anu.edu.au> Forwarded message: > From owner-neuroz at munnari.oz.au Tue Aug 18 19:25:34 1998 > Date: Mon, 17 Aug 1998 18:09:01 -0300 (ADT) > From: Lev Goldfarb > X-Sender: goldfarb at sol.sun.csd.unb.ca > Reply-To: Lev Goldfarb > To: connectionists at cs.cmu.edu, inductive at unb.ca > Subject: Re: Connectionist symbol processing: any progress? > In-Reply-To: <199808160830.UAA08508 at rialto.mcs.vuw.ac.nz> > Message-Id: > Mime-Version: 1.0 > Content-Type: TEXT/PLAIN; charset=US-ASCII > > On Monday August 17, Lev Goldfarb wrote: > > However, in general, WHO WILL GIVE YOU THE "RIGHT" DISTANCE MEASURE? I now > believe that the construction of the "right" distance measure is a more > basic, INDUCTIVE LEARNING, PROBLEM. In a classical vector space setting, > this problem is obscured because of the rigidity of the representation > space (and, as I have mentioned earlier, because of the resulting > uniqueness of the metric), which apparently has not raised any > substantiated suspicions in non-cognitive sciences. I strongly believe > that this is due to the fact that the classical measurement processes are > based on the concept of number and therefore as long as we rely on such > measurement processes we are back where we started from--vector space > representation. Just to add a note on this point of "what is the right distance measure" and "where do you get it": it is reasonably clear that if you are faced with just a single learning problem finding the right distance measure is equivalent to the learning problem itself. After all, in a classification setting the perfect distance measure would set the distance between examples belonging to the same class to zero, and the distance between examples belonging to different classes to some positive number. One-nearest-neighbour classification with such a distance metric and a training set containing at least one example of each class will have zero error. In contrast, in a "learning to learn" setting where a learner is faced with a (potentially infinite) *sequence* of learning tasks one can ask that the learner learns a distance metric that is in some sense appropriate for all the tasks. I think it is this sort of metric that people are thinking of when they talk about "the right distance measure". For example, in my life I have to learn to recognize thousands of faces, not just a single face. If I learn a distance measure that works for just a single face (say, just distinguish my father from everybody else) then that distance measure is unlikely to be the "right" measure for faces; it would most likely focus on some idiosyncratic feature of his face in order to make the distinction and would thus be unusable for distinguishing faces that don't possess such a feature. However, if I learn a distance measure that works for a large variety of faces, then that distance measure is more likely to focus on the "true" invariants of people's faces and hence has more chance of being the "right" measure. Anyway, to cut a long story short, you can formalize this idea of learning the right distance measure for a number of related tasks---I had papers on this in NIPS and ICML last year. You can also get them from my web page: http://wwwsyseng.anu.edu.au/~jon/papers/nips97.ps.gz http://wwwsyseng.anu.edu.au/~jon/papers/icml97.ps.gz. This idea has turned up in a number of different guises in various places (here are a few): Shimon Edelman. Representation, Similarity and the Chorus of Prototypes. Minds and Machines, 5, 1995.
Oehler and Gray. Combining image compression and classification using vector quantization. IEEE Transactions on PAMI. 17(5): 461--473. 1995. Thrun and Mitchell. Learning one more thing. TR CS-94-184, CMU, 1994. Cheers, Jon ------------- Jonathan Baxter Department of Systems Engineering Research School of Information Science and Engineering Australian National University http://keating.anu.edu.au/~jon Tel: +61 2 6279 8678 Fax: +61 2 6279 8688 From curt at doumi.ucr.edu Wed Aug 19 01:49:49 1998 From: curt at doumi.ucr.edu (Curt Burgess) Date: Tue, 18 Aug 98 22:49:49 -0700 Subject: Connectionist symbol processing: any progress? LSA & HAL models Message-ID: <9808190549.AA07717@doumi.ucr.edu> > One of things that has recently renewed my interest in the > idea of using distributed representations for processing > complex information was finding out about Latent Semantic > Analysis/Indexing (LSA/LSI) at NIPS*97. LSA is a method > for taking a large corpus of text and constructing vector I think LSA is an important approach in this symbol processing debate. A model similar in many ways to LSA is our Hyperspace Analogue to Language (HAL) model of memory (also at NIPS*97 [workshops]). One difference is that LSA (typically) is implemented in a matrix of word by larger text unit dimensions. HAL is a word by word matrix. There are other differences - one being how dimensionality is reduced. One big advantage of HAL and LSA is that they use learning procedures that scale up to real world language problems and thus can use large corpora as input. It would be difficult to put 300 million words through a SRN the number of times required for any learning to take place (!). With global co-occurrence models like HAL or LSA, this scalability isn't a problem. HAL and LSA also use continuous valued vector representations which results in very rich encoding of meaning. We've addressed the scalability issue by comparing HAL's algorithm to a SRN in a chapter available on my lab's web page (http://locutus.ucr.edu/Reprints.html) - get the Burgess & Lund, 1998, under review, document). We show that the same input into a SRN and HAL will get virtually identical output. The beauty of this is that one can use vector representations acquired in a global co-occurrence model in a connectionist model knowing that these vectors are what would likely be produced if they were learned via a connectionist methodology. In this chapter we also address a variety of other related issues (what is similarity? the symbol-grounding problem, the relationship between associations and categorical knowledge, modularity and syntactic constraints, developing asymmetric relationships between words, and, in a limited way, using high-dimensional models to mimic higher-level cognition). The chapter was written to be a little provocative. There are 6 or 7 papers detailing the HAL model available as PDFs and another 6 or 7 that you can order with the reprint order form. The latest issue of Discourse Processes (edited by Peter Foltz) is a special issue on quantitative approaches to language and is full of LSA and HAL papers. I will be editing a special journal issue that will have more HAL and LSA papers (a followup to the high-dimensional semantic space symposium at psychonomics last year). 
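To give a flavour of what a global co-occurrence model does, here is a minimal sketch (made-up corpus, window size, and weighting; a toy illustration only, not the actual HAL code or parameters):

    # Toy HAL-style word-by-word co-occurrence matrix.
    import numpy as np

    corpus = "the dog chased the cat the cat chased the mouse".split()
    vocab = sorted(set(corpus))
    idx = {w: i for i, w in enumerate(vocab)}

    window = 3                                  # assumed co-occurrence window
    M = np.zeros((len(vocab), len(vocab)))
    for i, w in enumerate(corpus):
        for d in range(1, window + 1):          # words preceding w; nearer words weigh more
            if i - d >= 0:
                M[idx[w], idx[corpus[i - d]]] += window + 1 - d

    # A word's meaning vector: its row (preceding context) joined with its
    # column (following context).  Similar usage gives similar vectors.
    meaning = {w: np.concatenate([M[idx[w]], M[:, idx[w]]]) for w in vocab}

One pass over the corpus suffices to fill the matrix, which is one way to see why such models scale to very large text collections.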
At the SCiP (Society for Computers in Psychology) conference in Nov (the day before Psychonomics), we will have a symposium on high-dimensional semantic memory models and Tom Landauer is giving the keynote talk (titled "How modern computation can turn cognitive psychology into a real science" - I suspect also a little provocative!). I gave the keynote at last year's SCiP meeting and this is available on our website and is in the last issue of BRMIC (Burgess, C. (1998). From Simple Associations to the Building Blocks of Language: Modeling Meaning in Memory with the HAL Model. Behavior Research Methods, Instruments, and Computers, 30, 188 - 198.). It's a good brief introduction to the range of problems we've addressed. The HAL and LSA work is certainly related to the "context vector" research that Steve Gallant was talking about. I guess that's enough... Curt --- Dr. Curt Burgess, Computational Cognition Lab Department of Psychology, University of California Riverside, CA 92521-0426 URL: http://locutus.ucr.edu/ Internet: curt at cassandra.ucr.edu MaBellNet: (909) 787-2392 FAX: (909) 787-3985 From Sebastian_Thrun at heaven.learning.cs.cmu.edu Wed Aug 19 19:17:01 1998 From: Sebastian_Thrun at heaven.learning.cs.cmu.edu (Sebastian Thrun) Date: Wed, 19 Aug 1998 19:17:01 -0400 Subject: Organize a workshop at IJCAI-99? Message-ID: Dear Connectionists: This is to bring to your attention a great opportunity to organize a workshop at the forthcoming IJCAI conference (IJCAI stands for International Joint Conference on Artificial Intelligence), which will take place 31 July - 6 August 1999 in Stockholm, Sweden. IJCAI is a leading AI conference, and in recent years there has been a good deal of overlap with meetings such as NIPS and Snowbird (e.g., work on learning, Bayesian methods). Organizing a workshop at IJCAI is a great way to get people outside the field involved in the type of work carried out in the "connectionist" community. For IJCAI-99, we will particularly welcome workshop proposals with cross-cutting themes. If you are interested in submitting a proposal, please consult the Web page http://www.dsv.su.se/ijcai-99/ Deadline for proposals is Oct 1, 1998. Proposals will be selected on a competitive basis. Workshop topics of past IJCAI conferences can be found at http://www.ijcai.org/past/. Sebastian Thrun (workshop chair, IJCAI-99) From marks at maxwell.ee.washington.edu Wed Aug 19 15:58:43 1998 From: marks at maxwell.ee.washington.edu (Robert J. Marks II) Date: Wed, 19 Aug 1998 12:58:43 -0700 Subject: What have neural networks achieved? In-Reply-To: <199808182217.SAA26188@alta.nj.nec.com> References: Message-ID: <3.0.1.32.19980819125843.00706c44@maxwell.ee.washington.edu> At 06:17 PM 8/18/98 -0400, Lee Giles wrote: >Michael Arbib wrote: > >> >> b) What are the "big success stories" (i.e., of the kind the general public >> could understand) for neural networks contributing to the construction of >> "artificial" brains, i.e., successfully fielded applications of NN hardware >> and software that have had a major commercial or other impact? >> >> >> >> ********************************* >> Michael A. Arbib >> USC Brain Project >> University of Southern California >> Los Angeles, CA 90089-2520, USA >> arbib at pollux.usc.edu >> (213) 740-9220; Fax: 213-740-5687 >> http://www-hbp.usc.edu/HBP/ >> >> > >For those of you who might have missed it, an entire issue of IEEE TNN >was devoted to "practical uses of NNs."
The issue was > >IEEE Transactions on Neural Networks >Volume 8, Number 4 July 1997 > Slides from a talk entitled "Neural Networks: Reduction to Practice" are on the web at http://cialab.ee.washington.edu/Marks-Stuff/icnn_97.html-ssi Nutshell summaries of the TNN papers in the special issue are given, plus numerous other everyday uses of neural networks. From timmers at nici.kun.nl Thu Aug 20 04:12:26 1998 From: timmers at nici.kun.nl (renee timmers) Date: Thu, 20 Aug 1998 10:12:26 +0200 Subject: job announcement Message-ID: Postdoctoral Researchers in Music Cognition At the Nijmegen Institute of Cognition and Information (NICI) of the Nijmegen University, a research team was set up in September 1997, supported by the Dutch Foundation for Scientific Research (NWO) as the PIONIER project "Music, Mind, Machine". This project aims at improving the understanding of the temporal aspects of musical knowledge and music cognition using computational models. The research is interdisciplinary in nature, with contributions from music theory, psychology and computer science. A number of studies are planned, grouped according to the following perspectives: the computational modeling methodology, the music domain itself, and applications of the findings. The methodological studies are concerned with the development of cognitive modeling languages, the study of (sub)symbolic formalisms, the development of programming language constructs for music, and the evaluation of physical metaphors in modeling expressive timing. The domain studies focus on specific temporal aspects of music, such as beat induction, grace note timing, musical expression and continuous modulations in music performance. In these areas both the construction of computational models and their experimental validation are being undertaken. The theoretical results will be applied in, e.g., editors for musical expression for use in recording studios. In order to realize these aims, a multi-disciplinary research group was formed, in which teamwork and collaboration play a crucial role. It is expected that all team members are actively involved in building the team and the realization of the project's aims. The demands on the team members are high: they are expected to conduct innovative and internationally recognized research. However, in return, our stimulating research environment provides adequate training and technical support, including a high-quality infrastructure and recording and music processing facilities. Close contact is maintained with the international community of researchers in this field. More information on the project and a description of the planned studies can be found at http://www.nici.kun.nl/mmm Ref 21.2.98 One postdoc will be responsible for improving an existing connectionist model for quantization and will design and validate this and other models and supervise their implementation. Quantization is the process of separating the categorical, discrete timing components (durations as notated in the musical score) from the continuous deviations in a musical performance. The project has, next to the fundamental aspects (connectionist models of categorical rhythm perception and their empirical validation), an important practical focus and aims at developing a robust component for automatic music transcription systems.
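(For readers unfamiliar with the term, a deliberately naive sketch of the separation that quantization performs, with invented numbers and a fixed grid; the connectionist models sought here are of course far more sophisticated than grid-snapping:)

    # Toy quantizer: split performed onset times (seconds) into score positions
    # on a fixed grid plus residual expressive deviations.
    performed = [0.00, 0.52, 0.98, 1.27, 1.49, 2.03]   # invented performance data
    grid = 0.25                                         # assumed smallest score unit

    quantized = [round(t / grid) * grid for t in performed]      # categorical part
    deviation = [p - q for p, q in zip(performed, quantized)]    # continuous part
    print(quantized)    # [0.0, 0.5, 1.0, 1.25, 1.5, 2.0]
    print(deviation)    # small expressive timing deviations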
The research will be realized at the lab for Medical and Biophysics (MBFYS) and at the Nijmegen Institute for Cognition and Information (NICI), both at the University of Nijmegen, and is funded by the Dutch Foundation for Technical Sciences (STW). We are looking for a psychologist with experience in both experimental methods and in computational modeling. Experience with attractor networks is an advantage. Appointment will be full-time for three years, with a possible extension. Ref 21.3.98 The other position requires a Doctorate in Music Theory/Analysis, Psychology, or Music Cognition. A thorough knowledge of the music cognition literature is required, preferably centering on a computational modeling approach. In addition, the candidate needs to have ample practical experience in conducting experiments and a thorough knowledge of music theory. Although the project focuses on musical performance and rhythmic structure, research experience in these domains is not essential. He or she must be able and willing to collaborate with the other members of the team on existing research projects and contribute to the supervision of doctoral level research. The ability to communicate clearly and work as part of a team is crucial. Experience in collaboration with researchers from computer science, artificial intelligence, or music technology would be beneficial, as would some knowledge of these fields. Appointment will be full-time for two years, with a possible extension. The Faculty of Social Sciences intends to employ a proportionate number of women and men in all positions in the faculty. Women are therefore urgently invited to apply. The selection procedure may entail an assessment of collaboration and communication skills. Applications (three copies, in English or Dutch) including a curriculum vitae and a statement about the candidate's professional interests and goals, and one copy of recent work (e.g., thesis, computer program, article) should be mailed before the 1st of November to the Department of Personnel & Organization, Faculty of Social Sciences, Catholic University Nijmegen, P.O.Box 9104, 6500 HE Nijmegen, The Netherlands. Please mark envelope and letter with the appropriate vacancy number. Questions can be addressed to Renee Timmers: timmers at nici.kun.nl From mashouq at ix.netcom.com Thu Aug 20 00:28:31 1998 From: mashouq at ix.netcom.com (Dr. Khalid Al-Mashouq) Date: Wed, 19 Aug 1998 23:28:31 -0500 Subject: What have neural networks achieved? Message-ID: <35DBA5EF.FCB1B000@ix.netcom.com> I am part of ACES (http://www.riyadhweb.com/aces), a telecommunications and electronics company based in Saudi Arabia. Recently we sold Lucent Technologies (formerly AT&T) a multi-million-dollar system to test the Saudi GSM mobile network quality on the voice level (not bit-error rate level). This system is made by ASCOM (http://www.ascom.ch/qvoice) and accepted worldwide. It uses neural network and fuzzy techniques to assess the quality of the received voice signal without paying the overhead of sending real people in the field to measure the quality. As ASCOM puts it: Using advanced neural networking technology, the fully automatic system is trained to map speech patterns, and to produce quality ratings which correlate over 98% to those produced "unconsciously" by the combination of the human ear and brain. (For technical details about this system and other related systems, see http://www.ascom.ch/qvoice/qos/qqos0000.htm and http://www.ascom.ch/qvoice/car/qcar0000.htm) Hope this information is useful.
Khalid Al-Mashouq Visiting professor at CMU, Pittsburgh. King Saud University Riyadh, Saudi Arabia http://www.angelfire.com/ok/almashouq From Yves.Moreau at esat.kuleuven.ac.be Thu Aug 20 11:59:56 1998 From: Yves.Moreau at esat.kuleuven.ac.be (Yves Moreau) Date: Thu, 20 Aug 1998 17:59:56 +0200 Subject: What have neural networks achieved? References: <3.0.1.32.19980819125843.00706c44@maxwell.ee.washington.edu> Message-ID: <35DC47FC.C2476BBA@esat.kuleuven.ac.be> Dear Connectionnists, I would like to point at the homepage of the European project SIENA, which has reviewed applications of neural networks in Europe and presents a number of case studies: http://www.mbfys.kun.nl/snn/siena/cases/ http://www.augusta.co.uk/siena/ Yves Moreau Department of Electrical Engineering Katholieke Universiteit Leuven Kardinaal Mercierlaan 94 B-3001 Leuven Belgium moreau at esat.kuleuven.ac.be Case studies of successful applications ------------------------- Benelux Prediction of Yarn Properties in Chemical Process Technology Current Prediction for Shipping Guidance in IJmuiden Recognition of Exploitable Oil and Gas Wells Modelling Market Dynamics in Food-, Durables- and Financial Markets Prediction of Newspaper Sales Production Planning for Client Specific Transformers Qualification of Shock-Tuning for Automobiles Diagnosis of Spot Welds Automatic Handwriting Recognition Automatic Sorting of Pot Plants Spain/Portugal Fraud detection in credit card transactions Drinking Water Supply Management On-line Quality Modelling in Polymer Production Neural OCR Processing of Employment Demands Neural OCR Personnel Information Processing at Madrid's Delegacion Provincial de Educacion Neural OCR Processing of Sales Orders Neural OCR Processing of Social Security Forms Germany/Austria Predicting Sales of Articles in Supermarkets Automatic Quality Control System for Tile-making Works HERACLAS - Quality Assurance by "listening" Optimizing Facilities for Polymerization Quality Assurance and Increased Efficiency in Medical Projects Classification of Defects in Pipelines A New Method for Computer Assisted Prediciton of Lymphnode-Metastasis in Gastric Cancer Alarm Identification with SENECA Facilities for Material-Specific Sorting and Selection Optimized Dryer-Regulation Application of Neural Networks for Evaluating the Reaction State of Penicillin-Fermenters Substitution of Analysers in Distillation Columns Optical Positioning in Industrial Production Short-Term Load Forecast for German Power Utility Monitoring of Water Dam ZN-Face: Access Control Using Automated Face Recognition Control of Tempering Furnaces France/Italy Helicopter Flight Data Analysis (HFDA) Neural Forecaster for On-line Load Profile Correction UK/Scandinavia For more than 30 UK case studies we refer to the applications portfolio at DTI's NeuroComputing Web ---------------------------------------------------------------- > >Michael Arbib wrote: > > > >> > >> b) What are the "big success stories" (i.e., of the kind the general public > >> could understand) for neural networks contributing to the construction of > >> "artificial" brains, i.e., successfully fielded applications of NN hardware > >> and software that have had a major commercial or other impact? > >> > >> > >> > >> ********************************* > >> Michael A. 
Arbib > >> USC Brain Project > >> University of Southern California > >> Los Angeles, CA 90089-2520, USA > >> arbib at pollux.usc.edu > >> (213) 740-9220; Fax: 213-740-5687 > >> http://www-hbp.usc.edu/HBP/ > >> > >> > > From stefan.wermter at sunderland.ac.uk Thu Aug 20 15:10:21 1998 From: stefan.wermter at sunderland.ac.uk (Stefan Wermter) Date: Thu, 20 Aug 1998 20:10:21 +0100 Subject: Connectionist symbol processing: any progress Message-ID: <35DC749D.5D7E2E33@sunderland.ac.uk> Jamie Henderson writes: > - Connectionist approaches to processing structural information have made > significant progress, to the point that they can now be justified on > purely empirical/engineering grounds. > - Connectionist methods do solve problems that current non-connectionist > methods have (ad-hoc independence assumptions, sparse data, etc.), > and people working in learning know it. > - Connectionist NLP researchers should be using modern empirical methods, > and they will be taken seriously if they do. I would support Jamie Hendersons view. While it might have been the state of the art 10 years ago to focus on small single networks for toy tasks in isolation, there has been an interesting development of using connectionist networks not only for cognitive modeling but for instance for (language) engineering. Larger modular architectures have been explored (for instance there were several recent issues on modular architectures in the journal connection science, guest-edited by A. Sharkey) and neural networks might also be used in context with other modules in larger systems. And it is useful and necessary to compare with traditional well-known techniques, e.g. n-grams, etc In some of our recent work on the screen system for instance, we have processed speech input from acoustics over syntax and semantics up to dialog levels based on two corpora of several thousand words. All the main processing could be done with neural networks in a modular architecture for a speech/language system. So connectionist techniques are not only useful for modeling specific cognitive constraints well but can also be used successfully for larger tasks like learning text tagging or learning spoken language analysis. Below some references if interested. Wermter S., Weber, V. 1997. SCREEN: Learning a flat syntactic and semantic spoken language analysis using artificial neural networks, Journal of Artificial Intelligence Research 6(1) p. 35-85 Wermter, S. Riloff, E. Scheler, G. (Ed). 1996. Connectionist, Statistical and Symbolic Approaches to Learning for Natural Language Processing Springer Verlag, Berlin. Wermter S., Meurer M. 1997. Building lexical representations dynamically using artificial neural networks. Proceedings of the International Conference of the Cognitive Science Society, p. 802-807, Stanford. I would be interested to hear if anybody working on neural network techniques has recently developed MODULAR neural techniques in other fields, e.g. for integrating vision and speech, data/text mining, intelligent controllers, learning web agents, neuro-fuzzy reasoning, information extraction and information retrieval or other forms of intelligent processing. best wishes, Stefan ******************************************** Professor Stefan Wermter Research Chair in Intelligent Systems University of Sunderland Dept. 
of Computing & Information Systems St Peters Way Sunderland SR6 0DD United Kingdom phone: +44 191 515 3279 fax: +44 191 515 2781 email: stefan.wermter at sunderland.ac.uk http://osiris.sunderland.ac.uk/~cs0stw/ ******************************************** From ted.carnevale at yale.edu Thu Aug 20 18:02:35 1998 From: ted.carnevale at yale.edu (Ted Carnevale) Date: Thu, 20 Aug 1998 18:02:35 -0400 Subject: NEURON course at SFN98 Message-ID: <35DC9CFB.6BD6@yale.edu> Short course announcement Title: Using the NEURON Simulation Environment What, where, and when: This is a Satellite Symposium that will be presented at the Society for Neuroscience meeting in Los Angeles, CA, on Saturday, Nov. 7, 1998. Speakers: N.T. Carnevale, M.L. Hines, J.W. Moore, and G.M. Shepherd Description NEURON, which runs under UNIX/Linux, MSWindows, and MacOS, is an advanced simulation environment that handles realistic models of biophysical mechanisms, individual neurons, and networks of cells. In lectures with live computer demonstrations, this course will present information essential for teaching and research applications of NEURON. It will emphasize practical issues that are key to the most productive use of this powerful and convenient modeling tool. Partial list of topics to be covered: * How NEURON separates biologically important features, such as spatio-temporal complexity, from purely numerical concerns like accuracy and stability. * Efficient design, implementation, and use of models, including variable-order variable-timestep integration, and NEURON's latest tools for network simulations. * Using the graphical interface to control and modify simulations, and to analyze data and simulation results without additional programming. * Getting the most out of special features such as vectors and the extensive function library. * Expanding NEURON's repertoire of biophysical mechanisms. * Using NEURON simulations in neuroscience education. * Databases for empirically-based modeling. Each registrant will receive a CD-ROM with software, plus a comprehensive set of notes that includes material which has not yet appeared elsewhere in print. Registration is limited to 55 individuals on a first-come, first-serve basis. Early registration deadline Friday, October 2, 1998 Late registration deadline Friday, October 16, 1998 NO on-site registration will be accepted. For more information and the electronic registration form, see http://www.neuron.yale.edu/sfn98.html --Ted From l.s.smith at cs.stir.ac.uk Fri Aug 21 12:22:37 1998 From: l.s.smith at cs.stir.ac.uk (Dr L S Smith (Staff)) Date: Fri, 21 Aug 98 17:22:37 +0100 Subject: Book from EWNS1 Neuromorphic Systems Workshop Message-ID: <199808211622.RAA26776@tinker.cs.stir.ac.uk> Title: Neuromorphic Systems: Engineering Silicon from Neurobiology Editors: L.S. Smith and A. Hamilton Publisher: World Scientific. 260 pages. Series: Progress in Neural Processing 10. ISBN: 981 02 3377 9 This book is the refereed proceedings of the 1st European Workshop on Neuromorphic Systems, held in August 1997, at Stirling, Scotland. Further details of the book may be found at the www page of the conference, http://www.cs.stir.ac.uk/~lss/Neuromorphic/Info1 or at the publishers site http://www.wspc.com.sg/books/compsci/3702.html *note* that the 2nd European Workshop on Neuromorphic Systems will be held at Stirling, Scotland from 3-5 September 1999. No www page yet. Dr Leslie S. 
Smith Dept of Computing Science and Mathematics, Univ of Stirling Stirling FK9 4LA, Scotland l.s.smith at cs.stir.ac.uk (NeXTmail and MIME welcome) Tel (44) 1786 467435 Fax (44) 1786 464551 www http://www.cs.stir.ac.uk/~lss/ From juergen at idsia.ch Fri Aug 21 12:28:52 1998 From: juergen at idsia.ch (Juergen Schmidhuber) Date: Fri, 21 Aug 1998 18:28:52 +0200 Subject: PhD student jobs Message-ID: <199808211628.SAA04263@ruebe.idsia.ch> ******** ETH Zurich and IDSIA in Lugano (Switzerland) ******* PhD student positions We are seeking two outstanding PhD candidates for an exciting research project that combines machine learning (reinforcement learning, evolutionary computation, neural nets) and computational fluid dynamics. We intend to tackle problems such as drag minimisation, noise control, etc., using innovative control devices such as synthetic actuators, active skins, etc. This is a joint project of the Institute of Fluid Dynamics at ETH Zurich and the machine learning research institute IDSIA in Lugano (IDSIA ranked among the world's top ten AI labs in the 1997 "X-Lab Survey" by Business Week Magazine). Both are located in Switzerland, origin of the WWW and country with highest citation impact factor as well as most Nobel prizes and supercomputing capacity per capita. We will maintain very active links to Fluid Dynamics and AI institutes at Stanford University and NASA Ames Research Center. We offer an attractive Swiss PhD student salary. Highly qualified candidates are sought with a background in computational sciences, engineering, mathematics, physics or other relevant areas. Applicants should submit: (i) a detailed curriculum vitae, (ii) a list of three references (and their email addresses), (iii) transcripts of undergraduate and graduate (if applicable) studies, and (iv) a concise statement of their research interests (two pages max). Candidates are also encouraged to submit their scores in the Graduate Record Examination (GRE) general test (if available). Please send all documents to: Petros Koumoutsakos, www.ifd.mavt.ethz.ch Institute for Fluid Dynamics ETH Zentrum, CH-8092, Zurich, Switzerland OR Juergen Schmidhuber www.idsia.ch IDSIA, Corso Elvezia 36, 6900-Lugano, Switzerland Applications (with WWW pointers to studies or papers, if available) can also be submitted electronically (in plain ASCII or postscript format) to petros at ifd.mavt.ethz.ch or juergen at idsia.ch Petros & Juergen From marshall at cs.unc.edu Fri Aug 21 12:25:25 1998 From: marshall at cs.unc.edu (Jonathan A. Marshall) Date: Fri, 21 Aug 1998 12:25:25 -0400 (EDT) Subject: New vision & pattern recognition papers available Message-ID: I would like to announce the availability of several new papers on vision, pattern recognition, and neural systems. These papers may be obtained from http://www.cs.unc.edu/~marshall --Jonathan A. Marshall marshall at computer.org Dept of Computer Science, Univ of North Carolina, Chapel Hill, NC, USA. Visionics Corp., Jersey City, NJ, USA. ---------------------------------------------------------------------------- Gupta VS, Alley RK, Marshall JA, "Development of triadic neural circuits for visual image stabilization under eye movements." Submitted for journal publication, July 1998. Human visual systems maintain a stable internal representation of a scene even though the image on the retina is constantly changing because of eye movements. Such stabilization can theoretically be effected by dynamic shifts in the receptive field (RF) of neurons in the visual system.
This paper examines how a neural circuit can learn to generate such shifts. The shifts are controlled by eye position signals and compensate for the movement in the retinal image caused by eye movements. The development of a neural shifter circuit (Olshausen, Anderson, & Van Essen, 1992) is modeled using triadic connections. These connections are gated by signals that indicate the direction of gaze (eye position signals). In simulations, a neural model is exposed to sequences of stimuli paired with appropriate eye position signals. The initially nonspecific gating weights change, using a triadic learning rule. The pattern of gating develops so that different eye position signals selectively gate pathways from different positions within the visual field. Neurons then exhibit dynamic RF shifts, responding to the preferred stimulus within the RF and continuing to respond when the stimulus moves because of a shift in eye position. The triadic learning rule thus produces a shifter circuit that exhibits visual image stabilization. Traditional dyadic networks and learning rules do not produce such behavior. The self-organization capability of the model reduces the need for detailed pre-wiring or specific genetic programming of development. This shifter circuit model may also help in analyzing the behavior and formation of anticipatory RF shifts, which can reduce latency of visual response after eye movements, and attention-modulated changes in visual processing. ---------------------------------------------------------------------------- Marshall JA, Gupta VS, "Generalization and exclusive allocation of credit in unsupervised category learning." Network: Computation in Neural Systems 9:279-302, May 1998. A new way of measuring generalization in unsupervised learning is presented. The measure is based on an exclusive allocation, or credit assignment, criterion. In a classifier that satisfies the criterion, input patterns are parsed so that the credit for each input feature is assigned exclusively to one of multiple, possibly overlapping, output categories. Such a classifier achieves context-sensitive, global representations of pattern data. Two additional constraints, sequence masking and uncertainty multiplexing, are described; these can be used to refine the measure of generalization. The generalization performance of EXIN networks, winner-take-all competitive learning networks, linear decorrelator networks, and Nigrin's SONNET-2 network is compared. ---------------------------------------------------------------------------- Marshall JA, Schmitt CP, Kalarickal GJ, Alley RK, "Neural model of transfer-of-binding in visual relative motion perception." To appear in Computational Neuroscience: Trends in Research, 1998. January 1998. How can a visual system or cognitive system use the changing relationships between moving visual elements to decide which elements belong together as groups (or objects)? We have constructed a neural circuit model that selects object groupings based on global Gestalt common-fate evidence and uses information about the behavior of each group to predict the behavior of elements of the group. A simple competitive neural circuit binds elements into a representation of an object. Information about the spiking pattern of neurons allows transfer of the bindings of an object representation from location to location in the neural circuit as the object moves. 
The model exhibits characteristics of human object grouping and solves some key neural circuit design problems in visual relative motion perception. ---------------------------------------------------------------------------- Marshall JA, Srikanth V, "Curved trajectory prediction using a self-organizing neural network." Submitted for journal publication, September 1997. Existing neural network models are capable of tracking linear trajectories of moving visual objects. This paper describes an additional neural mechanism, disfacilitation, that enhances the ability of a visual system to track curved trajectories. The added mechanism combines information about an object's trajectory with information about changes in the object's trajectory, to improve the estimates for the object's next probable location. Computational simulations are presented that show how the neural mechanism can learn to track the speed of objects and how the network operates to predict the trajectories of accelerating and decelerating objects. ---------------------------------------------------------------------------- These five papers form part of Dr. George Kalarickal's recent dissertation: Kalarickal GJ, Theory of Cortical Plasticity in Vision. PhD Dissertation, Department of Computer Science, University of North Carolina at Chapel Hill, 1998. Kalarickal GJ, Marshall JA, "Comparison of generalized Hebbian rules for long-term synaptic plasticity." Submitted for journal publication, July 1998. Kalarickal GJ, Marshall JA, "The role of afferent excitatory and lateral inhibitory synaptic plasticity in visual cortical ocular dominance plasticity." Submitted for journal publication, July 1998. Kalarickal GJ, Marshall JA, "Plasticity in cortical neuron properties: Modeling the effects of an NMDA antagonist and a GABA agonist during visual deprivation." Submitted for journal publication, July 1998. Kalarickal GJ, Marshall JA, "Models of receptive field dynamics in visual cortex." Submitted for journal publication, May 1998. Kalarickal GJ, Marshall JA, "Rearrangement of receptive field topography after intracortical and peripheral stimulation: The role of plasticity in inhibitory pathways." Submitted for journal publication, July 1998. A theory of postnatal activity-dependent neural plasticity based on synaptic weight modification is presented. Synaptic weight modifications are governed by simple variants of a Hebbian rule for excitatory pathways and an anti-Hebbian rule for inhibitory pathways. The dissertation focuses on modeling the following cortical phenomena: long-term potentiation and depression (LTP and LTD); dynamic receptive field changes during artificial scotoma conditioning in adult animals; adult cortical plasticity induced by bilateral retinal lesions, intracortical microstimulation (ICMS), and repetitive peripheral stimulation; changes in ocular dominance during "classical" rearing conditioning; and the effect of neuropharmacological manipulations on plasticity. Novel experiments are proposed to test the predictions of the proposed models, and the models are compared with other models of cortical properties. The models presented in the dissertation provide insights into the neural basis of perceptual learning. In perceptual learning, persistent changes in cortical neuronal receptive fields are produced by conditioning procedures that manipulate the activation of cortical neurons by repeated stimulation of localized regions. 
Thus, the analysis of synaptic plasticity rules for receptive field changes produced by conditioning procedures that activate small groups of neurons can also elucidate the neural basis of perceptual learning. Previous experimental and theoretical work on cortical plasticity focused mainly on afferent excitatory synaptic plasticity. The novel and unifying theme in this work is self-organization and the use of the lateral inhibitory synaptic plasticity rule. Many cortical properties, e.g., orientation selectivity, motion selectivity, spatial frequency selectivity, etc. are produced or strongly influenced by inhibitory interactions. Thus, changes in these properties could be produced by lateral inhibitory synaptic plasticity. ---------------------------------------------------------------------------- From arobert at cogsci.ucsd.edu Fri Aug 21 15:33:46 1998 From: arobert at cogsci.ucsd.edu (Adrian Robert) Date: Fri, 21 Aug 98 12:33:46 -0700 Subject: What have neural networks achieved? References: Message-ID: <199808211933.MAA12922@briah.ucsd.edu> After Dr. Arbib's request everyone seems to have been coming up with commercial applications (using NNs as statistical analyzers) but as far as his first question -- about generation of insight into real brain function -- zero. Lest this be taken as a sinister sign in yet another area of neural network research, I hurry to mention the one major example that I'm familiar with -- that of understanding the influence of environmental input on cortical neural representations. The work I'm talking about is of course that done, starting with von der Malsburg and others in the 70's, given a fresh impulse by Linsker in the 80's, and most thoroughly connected to the biology by Miller in the 90's, on the development of orientation selectivity (and also maps of orientation selectivity and ocular dominance columns) in primary visual cortex. While anyone in the field will tell you that the final word has yet to be said, this work genuinely provides insight -- it shows how the important elements of a class of biological neural systems can be translated into mathematical terms and how observed results emerge naturally from this translation. You leave an encounter with it feeling you have really understood something about the way things work -- and, although these methods have only been applied to the first visual area in the cortex (for the most part), they are general enough that they provide more than an inkling about what must be happening further in. (Long way to go though!) There are other examples... Adrian From jagota at cse.ucsc.edu Fri Aug 21 22:09:09 1998 From: jagota at cse.ucsc.edu (Arun Jagota) Date: Fri, 21 Aug 1998 19:09:09 -0700 Subject: Record/archive of debate? Message-ID: <199808220209.TAA18579@arapaho.cse.ucsc.edu> It would be nice if some sort of a record of the "Connectionist Symbol Processing" debate were to be produced and archived for the benefit of the community. With this in mind I have a specific proposal (which ties in with an unusual idea I have thought about on occasion). I invite interested individuals (especially those who contributed to this e-debate) to consider contributing a brief survey of their relevant work (positive or negative) on Connectionist Symbol Processing, a brief survey of some other group's relevant work that they are well-acquainted with, or a brief summary of their position on this topic. The aim is to collect these contributions into an informal, survey-type, "distributed article". 
Contributors whose contributions would be accepted would become its co-authors (hence the term distributed). Such an article, with the individual contribution acceptance and "article-wide review for improvement" processes yet to be determined, I would aim to have archived in Neural Computing Surveys. If you are seriously interested in contributing, inform me of your intent to contribute by e-mail (jagota at cse.ucsc.edu). Whether we proceed to the implementation phase will depend on the feedback I receive. I'd expect contributions to range from half a page to a page. Contributors to this planned article who contributed e-mail messages to this debate might well send polished versions of their e-mail messages (but clearly not those of others) as contributions. Also, plain text contributions will normally suffice. (My aim is to make it as easy as possible for you to contribute what specialized knowledge you have access to, towards this hopefully community-benefiting article. Having said that, there will be some sort of individual contribution review/acceptance threshold and subsequent article-wide review.) I will not be an author. For comments/questions, contact me directly, NOT the connectionists list. Arun Jagota jagota at cse.ucsc.edu From juergen at idsia.ch Sat Aug 22 06:18:50 1998 From: juergen at idsia.ch (Juergen Schmidhuber) Date: Sat, 22 Aug 1998 12:18:50 +0200 Subject: recent debate Message-ID: <199808221018.MAA04799@ruebe.idsia.ch> A side note on what Jon Baxter wrote: > In contrast, in a "learning to learn" setting where a learner is faced > with a (potentially infinite) *sequence* of learning tasks ... A more appropriate name for this is "inductive transfer." The traditional meaning of "learning to learn" is "learning learning algorithms." It refers to systems that search a space whose elements are credit assignment strategies, and is conceptually independent of whether or not there are different tasks. For instance, in contexts where there is only one task (such as receiving a lot of reward over time) the system may still be able to "learn to learn" by using experience for continually improving its learning algorithm (more on this on my home page). A note on what Bryan Thompson wrote: > If we consider that the primary mechanism of recurrence in a > distributed representations as enfolding space into time, I still have > reservations about the complexity that the agent / organism faces in > learning an enfolding of mechanisms sufficient to support symbolic > processing. There is a recurrent net method called "Long Short-Term Memory" (LSTM) which does not require "enfolding space into time". LSTM's learning algorithm is local in both space and time (unlike BPTT's and RTRL's). Despite its low computational complexity LSTM can learn algorithmic solutions to many "symbolic" and "subsymbolic" tasks (according to the somewhat vague distinctions that have been proposed) that BPTT/RTRL and other existing recurrent nets cannot learn: Sepp Hochreiter and J. Schmidhuber. Long Short-Term Memory. Neural Computation, 9(8):1735-1780, 1997 Juergen Schmidhuber, IDSIA www.idsia.ch/~juergen From tabor at CS.Cornell.EDU Sat Aug 22 15:04:49 1998 From: tabor at CS.Cornell.EDU (Whitney Tabor) Date: Sat, 22 Aug 1998 15:04:49 -0400 (EDT) Subject: Connectionist symbol processing---any progress?
Message-ID: <199808221904.PAA04651@gibson.cs.cornell.edu> David Touretzky wrote: > I concluded that connectionist symbol processing had reached a > plateau, and further progress would have to await some revolutionary > new insight about representations. and mentioned Tony Plate's inspiring work on Holographic Reduced Representations (HRRs). I don't think that anybody has yet mentioned the related line of work which sets aside the learning problem for the moment and focuses on the geometry of the trajectories of metric (and vector) space computers, including many connectionist networks. One essential idea (proposed in embryonic form in Pollack, 1987, 1991, Siegelmann and Sonntag, 1991, Wiles and Elman, 1995, Rodriguez, Wiles, and Elman, to appear) is to use fractals to organize recursive computations in a bounded metric space. Cris Moore (Moore, 1996) provides the first substantial development of this idea, relating it to the traditional practice of classifying machines based on their computational power. He shows, for example, that every context free language can be recognized by some "dynamical recognizer" that moves around on an elaborated, one-dimensional Cantor Set. I have described a similar method which operates on high-dimensional Cantor sets and thus leads to an especially natural implementation in neural hardware (Tabor, 1998, submitted). This approach sheds some new light on the symbolic vs. metric space computation question by showing how we can use the structured entities recognized by traditional computational theory (e.g. particular context free grammars) as bearing points in navigating the larger set (Siegelman, 1996; Moore, 1996) of computing devices embodied in many analog computers. To my knowledge, no one has tried to use this kind of computational/geometric perspective to interpret HRRs and related outer product representations---I think this would be a very rewarding project. Whitney Tabor University of Connecticut http://www.cs.cornell.edu/home/tabor/tabor.html @UNPUBLISHED{Pollack:87, AUTHOR = {Jordan B. Pollack}, TITLE = {On Connectionist Models of Natural Language Processing}, NOTE = {Ph.D. Thesis, Department of Computer Science, University of Illinois}, YEAR = {1987}, } @ARTICLE{S&S:91, AUTHOR = {H. T. Siegelmann and E. D. Sontag}, TITLE = {Turing computability with neural nets}, JOURNAL = {Applied Mathematics Letters}, YEAR = {1991}, VOLUME = {4}, NUMBER = {6}, PAGES = {77-80}, } @ARTICLE{Pollack:91, AUTHOR = {Jordan B. Pollack}, TITLE = {The Induction of Dynamical Recognizers}, JOURNAL = {Machine Learning}, YEAR = {1991}, VOLUME = {7}, PAGES = {227-252}, } @INCOLLECTION{W&E:95, AUTHOR = {Janet Wiles and Jeff Elman}, TITLE = {Landscapes in Recurrent Networks}, BOOKTITLE = {Proceedings of the 17th Annual Cognitive Science Conference}, EDITOR = {Johanna D. Moore and Jill Fain Lehman}, PUBLISHER = {Lawrence Erlbaum Associates}, YEAR = {1995}, } @ARTICLE{R&W&E:ta, AUTHOR = {Paul Rodriguez and Janet Wiles and Jeffrey Elman}, TITLE = {How a Recurrent Neural Network Learns to Count}, JOURNAL = {Connection Science}, YEAR = {ta}, VOLUME = {}, NUMBER = {}, PAGES = {}, } @UNPUBLISHED{Moore:96b, AUTHOR = {Christopher Moore}, TITLE = {Dynamical Recognizers: Real-time Language Recognition by Analog Computers}, NOTE = {TR No. 
96-05-023, Santa Fe Institute}, YEAR = {1996}, } @ARTICLE{Siegelmann:96, AUTHOR = {Hava Siegelmann}, TITLE = {The simple dynamics of super {T}uring theories}, JOURNAL = {Theoretical Computer Science}, YEAR = {1996}, VOLUME = {168}, PAGES = {461-472}, } @UNPUBLISHED{Tabor:98, AUTHOR = {Whitney Tabor}, TITLE = {Dynamical Automata}, NOTE = {47 pages. Tech Report \# 98-1694. Department of Computer Science, Cornell University. Download from http://cs-tr.cs.cornell.edu/}, YEAR = {1998}, } @UNPUBLISHED{Tabor:subb, AUTHOR = {Whitney Tabor}, TITLE = {Context Free Grammar Representation in Neural Networks}, NOTE = {7 pages. Draft version available at http://simon.cs.cornell.edu/home/tabor/papers.html}, YEAR = {submitted to NIPS}, } From arbib at pollux.usc.edu Mon Aug 24 02:01:18 1998 From: arbib at pollux.usc.edu (Michael A. Arbib) Date: Sun, 23 Aug 1998 22:01:18 -0800 Subject: What have neural networks achieved? Message-ID: >From: Adrian Robert >Date: Fri, 21 Aug 98 12:33:46 -0700 > >... everyone seems to have been coming up with >commercial applications (using NNs as statistical analyzers) but as far as >his >first question -- about generation of insight into real brain function -- >zero. My thanks to Adrian for reminding you of this question! For example, I think that Houk and Barto, Kawato, and my group (Schweighofer and Spoelstra) have begun to make real progress in elucidating the role of cerebellum in motor control. So: I would like to see responses of the form: "Models A and B have shown the role of brain regions C and D in functions E and F - see specific references G and H". The real interest comes when claims appear to conflict: Can we unify theories on the roles of cerebellum in both motor control and classical conditioning? What about the role of hippocampus in both spatial navigation and consolidation of short term memory? Thanks again, Adrian Robert! ********************************* Michael A. Arbib USC Brain Project University of Southern California Los Angeles, CA 90089-2520, USA arbib at pollux.usc.edu (213) 740-9220; Fax: 213-740-5687 http://www-hbp.usc.edu/HBP/ From bmg at cs.rmit.edu.au Sun Aug 23 10:49:54 1998 From: bmg at cs.rmit.edu.au (B Garner) Date: Mon, 24 Aug 1998 00:49:54 +1000 (EST) Subject: Record/archive of debate? Message-ID: <199808231449.AAA02154@numbat.cs.rmit.edu.au> * * It would be nice if some sort of a record of the "Connectionist * Symbol Processing" debate were to be produced and archived for * the benefit of the community. * I think this would be a good idea, because there were so many interesting ideas expressed. I recently published two training algorithms which I call symbolic; they are found at http://yallara.cs.rmit.edu.au/~bmg/algA.ps and algB.ps I have included the abstracts below. I say these algorithms are symbolic because they train the networks without finding a numerical solution: instead, sets of constraints are derived during training. These constraints show that the weights and the thresholds are all in relationship to each other at each 'neuron'. I have thought a lot about what 'symbol' means, and I have decided, largely, that 'symbol is something that takes its meaning from those symbols "around" it'. Perhaps there are better definitions because this one is self-referential. But this idea of symbol is close to, apparently, structural linguistics. Conveniently, perhaps you might think, this idea supports the results of my training algorithms.
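As a rough illustration of the constraint idea described above (a toy example of my own, not the algorithm in the papers): each binary training pair for a single linear threshold gate contributes an inequality relating the weights and the threshold, rather than a numeric weight update.

    # Symbolic constraints for one linear threshold gate (LTG) with weights w_i
    # and threshold T, derived from binary training pairs (x, target).
    def ltg_constraints(samples):
        constraints = []
        for x, target in samples:
            lhs = " + ".join("w%d*%d" % (i, xi) for i, xi in enumerate(x))
            constraints.append("%s %s T" % (lhs, ">=" if target == 1 else "<"))
        return constraints

    # "Training" an AND gate yields four constraints rather than numbers:
    for c in ltg_constraints([((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]):
        print(c)
    # w0*0 + w1*0 < T
    # w0*0 + w1*1 < T
    # w0*1 + w1*0 < T
    # w0*1 + w1*1 >= T

Any weights and threshold satisfying the four inequalities implement the gate, which is the sense in which no single numerical solution is needed.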
I have read some of the argument in this debate where someone said that the topology of the input space needs to be examined. With my second algorithm, once you know the topology of the input space the problem can be transformed and learnt very simply. Even problems such as the twin spiral problem can be learnt with one hidden layer. These algorithms are very simple, but I haven't finished writing up all the results I have yet. Here are the abstracts: A symbolic solution for adaptive feedforward neural networks found with a new training algorithm B. M. Garner, Department of Computer Science, RMIT, Melbourne, Australia. ABSTRACT Traditional adaptive feed forward neural network (NN) training algorithms find numerical values for the weights and thresholds. In this paper it is shown that a NN composed of linear threshold gates (LTGs) can function as a fully trained neural network without finding numerical values for the weights and thresholds. This surprising result is demonstrated by presenting a new training algorithm for this type of NN that resolves the network into constraints which describe all the numeric values the NN's weights and thresholds can take. The constraints do not require a numerical solution for the network to function as a fully trained NN which can generalize. The solution is said to be symbolic as a numerical solution is not required. ************************************************************************** A training algorithm for Adaptive Feedforward Neural Networks that determines its own topology B. M. Garner, Department of Computer Science, RMIT, Melbourne, Australia. ABSTRACT There has been some interest in developing neural network training algorithms that determine their own architecture. A training algorithm for adaptive feedforward neural networks (NN) composed of Linear Threshold Gates (LTGs) is presented here that determines its own architecture and trains in a single pass. This algorithm produces what is said to be a symbolic solution as it resolves the relationships between the weights and the thresholds into constraints which do not need to be solved numerically. The network has been shown to behave as a fully trained neural network which generalizes, and the possibility that the algorithm has polynomial time complexity is discussed. The algorithm uses binary data during training. From sml at essex.ac.uk Mon Aug 24 03:49:17 1998 From: sml at essex.ac.uk (Simon Lucas) Date: Mon, 24 Aug 1998 08:49:17 +0100 Subject: Connectionist symbol processing Message-ID: <35E11AFD.5232DE9A@essex.ac.uk> I would suggest that most recurrent neural net architectures are not fundamentally more 'neural' than hidden Markov models - think of an HMM as a neural net with second-order weights and linear activation functions. HMMs are, of course, very much alive and kicking, and routinely and successfully applied to problems in speech and OCR, for example. It might be argued that the HMMs tend to employ less distributed representations than RNNs, but even if this is true, so what?
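To make the correspondence concrete, here is a minimal sketch (toy numbers of my own, not Bridle's alpha-net code) of the HMM forward recursion written as a recurrent update; the effective weights are products of transition and emission probabilities, and the units are linear:

    # alpha_t(j) = b_j(o_t) * sum_i alpha_{t-1}(i) * a_ij  -- a linear recurrent step
    import numpy as np

    A = np.array([[0.7, 0.3],      # state transition probabilities a_ij
                  [0.4, 0.6]])
    B = np.array([[0.9, 0.1],      # emission probabilities b_j(o), two symbols
                  [0.2, 0.8]])
    pi = np.array([0.5, 0.5])      # initial state distribution

    obs = [0, 1, 1, 0]             # an observation sequence
    alpha = pi * B[:, obs[0]]      # initial "activations"
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # multiply-and-sum, no squashing function

    print(alpha.sum())             # likelihood of the sequence under the HMM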
Some interesting work that has explored links between the two: @ARTICLE{Bridle-alpha-net, AUTHOR = "Bridle, J.S.", TITLE = "Alpha-nets: a recurrent ``neural'' network architecture with a hidden Markov model interpretation", JOURNAL = "Speech Communication", YEAR = "(1990)", VOLUME = 9, PAGES = "83 -- 92"} @ARTICLE{Bengio-iohmm, AUTHOR = "Bengio, Y and Frasconi, P", TITLE = "Input-output HMMs for sequence processing", JOURNAL = "IEEE Transactions on Neural Networks", YEAR = "(1996)", VOLUME = 7, PAGES = "1231 -- 1249"} Also related to the discussion is the Syntactic Neural Network (SNN) - an architecture I developed in my PhD thesis (refs below). The SNN is a modular architecture that is able to parse and (in some cases) infer context-free (and therefore also regular, linear etc) grammars. The architecture is composed of Local Inference Machines (LIMs) that rewrite pairs of symbols. These are then arranged in a matrix parser formation (see Younger1967) to handle general context-free grammars - or we can alter the SNN macro-structure in order to specifically deal with simpler classes of grammar such as regular, strictly-hierarchical or linear. The LIM remains unchanged. In my thesis I only developed a local learning rule for the strictly-hierarchical grammar, which was a specialisation of the Inside/Outside algorithm for training stochastic context-free grammars. By constructing the LIMs from forward-backward modules (see Lucas-fb) however, any SNN that you construct automatically has an associated training algorithm. I've already proven this to work for regular grammars, I'm now in the process of testing some other cases - I'll post the paper to this group when its done. refs: @ARTICLE{Younger1967, AUTHOR = "Younger, D.H.", TITLE = "Recognition and parsing of context-free languages in time $n^{3}$", JOURNAL = "Information and Control", VOLUME = 10, NUMBER = 2, PAGES = "189 -- 208", YEAR = "(1967)"} @ARTICLE{Lucas-snn1, AUTHOR = "Lucas, S.M. and Damper, R.I.", TITLE = "Syntactic neural networks", JOURNAL = "Connection Science", YEAR = "(1990)", VOLUME = "2", PAGES = "199 -- 225"} @ARTICLE{Lucas-phd, AUTHOR = "Lucas, S.M.", TITLE = "Connectionist Architectures for Syntactic Pattern Recognition", JOURNAL = "PhD Thesis, University of Southampton", YEAR = "(1991)"} ftp://tarifa.essex.ac.uk/images/sml/reports/fbnet.ps @INCOLLECTION{Lucas-fb, AUTHOR = "Lucas, S.M.", TITLE = "Forward-backward building blocks for evolving neural networks with intrinsic learning behaviours", BOOKTITLE = "Lecture Notes in Computer Science (1240): Biological and artificial computation: from neuroscience to technology", YEAR = "(1997)", PUBLISHER = "Springer-Verlag", PAGES = "723 -- 732", ADDRESS = "Berlin"} ------------------------------------------------ Simon Lucas Department of Electronic Systems Engineering University of Essex Colchester CO4 3SQ United Kingdom Tel: (+44) 1206 872935 Fax: (+44) 1206 872900 Email: sml at essex.ac.uk http://esewww.essex.ac.uk/~sml secretary: Mrs Wendy Ryder (+44) 1206 872437 ------------------------------------------------- From michael.j.healy at boeing.com Mon Aug 24 14:25:57 1998 From: michael.j.healy at boeing.com (Michael J. Healy 425-865-3123) Date: Mon, 24 Aug 1998 11:25:57 -0700 Subject: Connectionist symbolic processing Message-ID: <199808241825.LAA15169@lilith.network-b> I've been doing research in connectionist symbol processing for some time, so I'd like to contribute something to the discussion. I'll try to keep it brief and just say what I'm ready to say. 
I am not prepared to address Michael Arbib's question about real brain function at this time, although it's possible to make a connection. First, here are some references to the literature of rule extraction with neural networks, which I have been following. The list omits a lot of good work, but is meant to be representative: Andrews, R., Diederich, J. & Tickle, A. B. (1995) "Survey and critique of techniques for extracting rules from trained artificial neural networks", Knowledge-Based Systems, vol. 8, no. 6, 373-389. Craven, M. W. & Shavlik, J. W. (1993) "Learning Symbolic Rules using Artificial Neural Networks", Proceedings of the 10th International Machine Learning Conference, Amherst, MA. 73-80. San Mateo, CA:Morgan Kaufmann. Healy, M. J. & Caudell, T. P. (1997) "Acquiring Rule Sets as a Product of Learning in a Logical Neural Architecture", IEEE Transactions on Neural Networks, vol. 8, no. 3, 461-475. Kasabov, N. K. (1996) "Adaptable neuro production systems", Neurocomputing, vol. 13, 95-117. Setiono, R. (1997) "Extracting Rules from Neural Networks by Pruning and Hidden-Unit Splitting", Neural Computation, vol. 9, no. 1, 205-225. Sima, J. (1995) "Neural Expert Systems", Neural Networks, vol. 8, 261-271. Most of the work is empirical, but is accompanied by analyses of the practical aspects of extracting knowledge from data and of incorporating pre-existing knowledge along with the extracted knowledge. The supposed knowledge here is mostly in the form of if-then rules which, to greater or lesser extent, represent propositional statements. There is also some recent work on mathematically formalizing connectionist symbolic computations, for example: Pinkas, G. (1995) "Reasoning, nonmonotonicity and learning in connectionist networks that capture propositional knowledge", Artificial Intelligence 77, 203-247. I've been developing a formal semantic model for neural networks--- a mathematical model of concept representation in connectionist memories and learning by connectionist systems. I've found that such a model requires an explicit semantic model, in which the "universe" of things the concepts are about receives as much attention in the mathematical model as the concepts themselves. I think this is essential for resolving the ambiguities that crop up in discussions about symbolic processing and neural networks. For example, it allows me to make some statements about issues brought up in the discussion of connectionist symbol processing. Whether you agree with me or not, I'd certainly be interested in further discussion. I've been concentrating on geometric logic and its model theory (different sense of the word "model"), mostly (so far) in the form of point-set topology. The set-theoretic form is the simple version of the semantics of geometric logic. It's really a categorical logic, so the full semantic model requires category theory. Geometric logic is very strict in what it takes to assert a statement. It is meant to represent observational statements, ones whose positive instances can be observed. Topology is commonly studied in its point-set version, but the categorical form is better for formal semantics. Having said that, I'll stick with sets in the following. Also, I'll refer to the models of a theory as its instances. My main finding to date is that a sound and complete rule base--- one in which the rules are actually valid for all the data and which has all the rules---has the semantics of a continuous function between the right topological spaces. 
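Stated compactly, and anticipating the explanation below (my own shorthand, not Healy's notation: writing $[\phi]$ for the open set of instances of a formula $\phi$ and $f$ for the mapping on instances):

    f : \mathrm{Pt}(A) \to \mathrm{Pt}(B), \qquad
    (\alpha \Rightarrow \beta \ \text{is valid}) \iff [\alpha] \subseteq f^{-1}([\beta]), \qquad
    [\textstyle\bigvee_i \alpha_i] = f^{-1}([\beta]) \ \text{after refinement,}

so the formula-mapping half of the rule base sends each open set of B to an open set of A, which is exactly the continuity condition.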
This requires some explaining, not only the "all the rules" and "right topological spaces" business, but also the statement about continuous functions. For most, continuity means continuous functions on the real or complex numbers, or on vector spaces over same. But those are a special case: the topologies and continuous functions I work with also involve spaces normally represented by discrete- valued variables. Continuity is really the mathematical way of saying "similar things map to similar things". My first publication on this has some details (a more extensive treatment is to appear): M. J. Healy, Continuous Functions and Neural Network Semantics, Proc. of Second World Cong. of Nonlinear Analysts (WCNA96), Athens. In Nonlinear Analysis, Volume 30, issue #3, 1997. pp. 1335-1341 In geometric logic, a continuous function is two functions---a mapping from the instances (worlds, states, models) of theory A to the instances of theory B, and an associated mapping from the formulas of theory B to those of theory A. Without going into too much detail, the topological connection is that a set of things that satisfy a formula (instances of the formula) form an open set in a particular topological space. In the applications we often deal with, the training examples for a neural network are instances of a theory of the domain of application. A formula in the theory expresses a property of or a relation between instances. The instances are called "points" of the space, and the corresponding open set contains the points. Finite conjunctions of formulas correspond to the finite intersections of open sets, and we allow arbitrary disjunctions, corresponding to the unions (arbitrary disjunctions are appropriate for observations). There is a little more to it, because instead of the usual set unions we use unions over directed sets of subsets. A valid and complete rule base can be refined to have the form of the formula mapping half of a continuous function from space A (theory A and its instances, with the induced topology) to space B (as a special case, the two spaces can be the same, or can have the same points). Correspondingly, the open set for the antecedent of each refined rule is the inverse image under the point mapping of the open set for its consequent. The refinement is obtained by forming the disjunction of all antecedents with the same consequent. The points mapping of the continuous function expresses the fact that every instance of the antecedent of a rule must map to an instance of the consequent of the rule, where the rule expresses truth-preservation. This mathematical model relates directly to the work being done in rule extraction, even with the many different approaches and neural network models in use. Furthermore, I think it supports intuition, but I'd like you to be the judge. One thing I'd like to add is that the topological model is consistent with probabilistic modeling and fuzzy logic. The focus of this model really is upon semantics (or semiotics, if this is regarded as a model of sign-meaning relationships; I am mostly interested in the semantics). Finally, I'd like to comment upon an important issue that has appeared in this thread---how important is the input space topology (metric, structure, theory, ... )? I apologize if I've misiniterpreted any of what's been said, but here's my two cents. I don't think there is always a single "right" topological space. 
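The continuity condition above can be made concrete with a tiny finite example. The sketch below (Python, illustrative only: the spaces, points, and point mapping are invented, not taken from the WCNA96 paper) checks exactly the defining condition that the preimage of every open set of the codomain is open in the domain:

from itertools import product

def is_topology(points, opens):
    # A finite topology: contains {} and the whole space, and is closed
    # under pairwise intersection and union (enough in the finite case).
    opens = set(opens)
    if frozenset() not in opens or frozenset(points) not in opens:
        return False
    return all(u & v in opens and u | v in opens
               for u, v in product(opens, repeat=2))

def preimage(f, subset):
    return frozenset(x for x in f if f[x] in subset)

def is_continuous(f, opens_A, opens_B):
    # f maps points of space A to points of space B.
    return all(preimage(f, v) in opens_A for v in opens_B)

# Toy "theories": points are instances, open sets are the instances
# satisfying some observable formula (all names invented).
A_points = {"a1", "a2", "a3"}
opens_A = {frozenset(), frozenset({"a1"}), frozenset({"a1", "a2"}),
           frozenset(A_points)}
B_points = {"b1", "b2"}
opens_B = {frozenset(), frozenset({"b1"}), frozenset(B_points)}

f = {"a1": "b1", "a2": "b1", "a3": "b2"}   # the point-mapping half

assert is_topology(A_points, opens_A) and is_topology(B_points, opens_B)
print(is_continuous(f, opens_A, opens_B))  # True: preimage of {b1} is {a1, a2}

In these terms, a refined rule base supplies the formula-mapping half: the open set of each antecedent is required to be the preimage of the open set of its consequent.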
The form of the data and how you handle it depends on what assumptions were made in working up the data for presentation as training (or testing) examples for the neural network. Formalizing, I would say that the assumptions yield a theory about the domain of inputs, and this in turn yields a topology. The topology does not have to be induced by a metric, not unless you make the assumption that distances between data points (in the metric sense) are valid. For example, if you have applied a Euclidean clustering algorithm, you have implicitly made the assumption that the Euclidean metric is the semantics of the application items that are being encoded as data items. What you get will be partly a result of that assumption. But what you get also depends upon the assumptions underlying the algorithm. If the algorithm is really coming up with anything new, it will impose a new topology upon the data. For example, a Euclidean clustering algorithm doesn't return all the open balls of the Euclidean-induced topology---it returns a finite set of class representations. However, you'd like the final result to have some connection with your original interpretation of the data, since after all that was your way of seeing the application. So, it would be nice to have continuity, meaning that every instance of the input domain theory maps to an instance of the output theory in a manner consistent with the formulas (open sets) in both theories (topologies). An advantage of the continuous function model here is that it tells me what I need to do: Modify the topologies (hence the theories) so that the inverse of an open set is open. Of course, that's only a mathematical abstraction, and the question is still So what do you do? Well, I don't think you want to discard the input topology outright, for the reason I gave: It is the theory that gave you your data. But you can modify it if need be. If you assumed a metric and your final classification result (assuming you were doing classification or clustering) has an output domain consisting of metric-induced open sets, you need do nothing. You can get more information by going to a more sophisticated pair of spaces by an embedding, but at least your algorithm gave you classes that projected back into the input topology, so you're OK there. However, for many data and machine models, the input space (or the output space or both) won't accept the projections gracefully, so you need to do something. One thing you can do is suppose you have the wrong learning algorithm and try to find one that will automatically yield continuity without changing the input space. Another thing you can do is suppose that the algorithm is telling you something about the input data space, and modify the topology as needed to accept the new open sets (extend the sub-base of the topology). See how your application looks now! How you proceed from here depends upon what kinds of properties you want to study. What I'm proposing is that the topological model is good as a guide for further work because of its mathematical precision in semantic modeling. Regards, Mike Healy -- =========================================================================== e Michael J. Healy A FA ----------> GA (425)865-3123 | | FAX(425)865-2964 | | Ff | | Gf c/o The Boeing Company | | PO Box 3707 MS 7L-66 \|/ \|/ Seattle, WA 98124-2207 ' ' USA FB ----------> GB -or for priority mail- e "I'm a natural man." 
2760 160th Ave SE MS 7L-66 B Bellevue, WA 98008 USA michael.j.healy at boeing.com -or- mjhealy at u.washington.edu ============================================================================ From adr at nsma.arizona.edu Mon Aug 24 17:35:10 1998 From: adr at nsma.arizona.edu (David Redish) Date: Mon, 24 Aug 1998 14:35:10 -0700 Subject: What have neural networks achieved? In-Reply-To: Your message of "Sun, 23 Aug 1998 22:01:18 PST." Message-ID: <199808242135.OAA20708@cortex.NSMA.Arizona.EDU> Michael Arbib wrote: >So: I would like to see responses of the form: >"Models A and B have shown the role of brain regions C and D in functions E >and F - see specific references G and H". >The real interest comes when claims appear to conflict: >Can we unify theories on the roles of cerebellum in both motor control and >classical conditioning? >What about the role of hippocampus in both spatial navigation and >consolidation of short term memory? In terms of the role of the hippocampus, a number of conflicting hypotheses have recently been shown not to be incompatible through computational modeling. The two major theories that have been argued over for the last twenty-plus years are (1) that the hippocampus forms a cognitive map for navigation (e.g. O'Keefe and Nadel, 1978) and (2) that the hippocampus stores episodic memories temporarily and replays them for consolidation into cortex (e.g. Cohen and Eichenbaum, 1993). We (David Touretzky and I, see Touretzky and Redish, 1996; Redish and Touretzky 1997, Redish 1997) examined the role of the hippocampus in the navigation domain by looking at the whole rodent navigation system (thereby attempting to put the role of the hippocampus in an anatomical context of a greater functional system). By looking at computational complexities and extensive simulations, we determined that the most likely role of the hippocampus in navigation is to allow an animal to reset an internal coordinate system on re-entry into an environment (i.e. to *self-localize* on returning to an environment). (From this theory we predicted that hippocampal lesions should not affect the ability of animals to wander out and return to a starting point, an ability called path integration which had previously been hypothesized to be hippocampally-dependent. This prediction has been borne out by recent experiments, Alyan and McNaughton, 1997). It is straight-forward to extend this idea of self-localization to a "return to context" which explains a large literature of primate data (Redish, 1999). In addition to the self-localization role, the hippocampus has been shown to replay recently traveled routes during sleep (Skaggs and McNaughton, 1996). However, the mechanisms that have been proposed to accomplish these two functions require incompatible connection matrices. Self-localization requires a symmetric component and route-replay requires an asymmetric component. We showed (Redish and Touretzky, 1998) that with the incorporation of external inputs representing spatial cues during self-localization (obviously necessary for accurate self-localization), self-localization can be accurate even with a weak asymmetric component, and that the weak asymmetric component is sufficient to replay the recently traveled routes (without the external input, which would presumably not be present during sleep). This shows that the two roles hypothesized for hippocampus are not incompatible. REFERENCES S. H. Alyan and B. M. Paul and E. Ellsworth and R. D. White and B. L. 
McNaughton (1997) Is the hippocampus required for path integration? Society for Neuroscience Abstracts. 23:504. N. J. Cohen and H. Eichenbaum (1993) Memory, Amnesia, and the Hippocampal System, MIT Press, Cambridge, MA. J. O'Keefe and L. Nadel (1978) The Hippocampus as a Cognitive Map, Clarendon Press, Oxford. A. D. Redish and D. S. Touretzky (1997) Cognitive Maps Beyond the Hippocampus, Hippocampus, 7(1):15-35. A. D. Redish (1997) Beyond the Cognitive Map: Contributions to a Computational Neuroscience Theory of Rodent Navigation, PhD Thesis. Carnegie Mellon University, Pittsburgh PA. A. D. Redish and D. S. Touretzky (1998) The role of the hippocampus in solving the {M}orris Water Maze, Neural Computation, 10(1):73-111. A. D. Redish (in press) Beyond the Cognitive Map: From Place Cells to Episodic Memory, MIT Press, Cambridge MA. W. E. Skaggs and B. L. McNaughton (1996) Replay of neuronal firing sequences in rat hippocampus during sleep following spatial experience, Science, 271:1870-1873. D. S. Touretzky and A. D. Redish (1996) A theory of rodent navigation based on interacting representations of space, Hippocampus, 6(3):247-270. ----------------------------------------------------- A. David Redish adr at nsma.arizona.edu Post-doc http://www.cs.cmu.edu/~dredish Neural Systems, Memory and Aging, Univ of AZ, Tucson AZ ----------------------------------------------------- From negishi at cns.bu.edu Tue Aug 25 05:11:25 1998 From: negishi at cns.bu.edu (Michiro Negishi) Date: Tue, 25 Aug 1998 05:11:25 -0400 Subject: Connectionist symbolic processing In-Reply-To: <199808241825.LAA15169@lilith.network-b> (michael.j.healy@boeing.com) Message-ID: <199808250911.FAA09262@music.bu.edu> Here are my 5 cents, from the self-organizing camp. On Mon, 10 Aug 1998 Dave_Touretzky at cs.cmu.edu wrote: > The problem, though, was that we > did not have good techniques for dealing with structured information > in distributed form, or for doing tasks that require variable binding. > While it is possible to do these things with a connectionist network, > the result is a complex kludge that, at best, sort of works for small > problems, but offers no distinct advantages over a purely symbolic > implementation. As many have already argued, at least empirically I don't feel that the issue of structured data representation as *the main* obstacle in constructing a model of symbolic processing, although it is an interesting and challenging subject. In my neural model of syntactic analysis and thematic role assignment for instance, I use the following neural fields for representing a word/phrase. (1) A field for representing the word or the head of the phrase. (there is a computational algorithm for determining the head of a phrase) (2) Fields for representing the features of the word/phrase as well as its children in the syntactic tree (or the semantic structure). Features are obtained by the PCA over the context in which the word/which appears. (3) Associator fields for retrieving children and the parent. In plain words, (1) is the lexical information, (2) is the featural information, and (3) is the associative pointer. The resultant representation is similar to RAAM. A key point in the feature extraction in the model is that once the parser begins to combine words into phrases, it begins to collect distributions in terms of the heads of the phrases, which in turn is used in the PCA. 
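As a rough illustration of the kind of feature extraction described above (PCA over the contexts in which a word appears), here is a small sketch; the toy corpus, the one-word context window, and the number of components are invented and are not those of the model itself:

import numpy as np

corpus = "the dog chased the cat . the cat saw the dog .".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

# Count left and right neighbours for each word occurrence.
counts = np.zeros((V, 2 * V))
for i, w in enumerate(corpus):
    if i > 0:
        counts[idx[w], idx[corpus[i - 1]]] += 1        # left context
    if i < len(corpus) - 1:
        counts[idx[w], V + idx[corpus[i + 1]]] += 1    # right context

# PCA via SVD of the centred count matrix; keep a few components.
centred = counts - counts.mean(axis=0)
U, S, Vt = np.linalg.svd(centred, full_matrices=False)
k = 3
features = centred @ Vt[:k].T   # one k-dimensional feature vector per word

for w in vocab:
    print(w, np.round(features[idx[w]], 2))

Once phrases are formed, the same counts would be collected over phrase heads rather than raw neighbouring words, as described above.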
The model was trained using a corpus that contains mothers' input to their children (a part of the CHILDES corpus), so it's not a "toy" model, although it cannot yet cope with the Wall Street Journal (I have to crawl before I walk :), which is to be expected given the very strict learning conditions of the model: no initial lexical or syntactic knowledge, and no external corrective signals from a teacher.

I think it is a virtue rather than a defect that this type of representation does not represent all concepts at once. In many languages, each word represents at most a very limited number of concepts, although it can also convey many features of itself and its children (e.g., in many languages, agreement morphemes attached to a verb encode the gender, person, etc. of the subject and objects). There are also island effects, which show that production of a clause can have access only to the concept itself and its direct children (and not to the internal structure below each child).

I think that the real challenge is to do cognitively plausible modeling that sheds new light on the understanding of language and cognition. That is why I constrain myself to self-organizing networks. As for future directions, I agree with Whitney Tabor that the application of fractal theory may be promising. I would be interested to know if someone has tried to interpret HPSG or more classical X-bar theory as fractals.

Here are some refs on self-organizing models of language (except for the famous ones by Miikkulainen). This line of research is alive, and will kick soon.

Ritter, H. and Kohonen, T. (1990). Learning semantotopic maps from context. Proceedings of IJCNN 90, Washington D.C., I.

Sholtes, J. C. (1991). Unsupervised context learning in natural language processing. In Proc. IJCNN Seattle 1991.

Negishi, M. (1995). Grammar learning by a self-organizing network. In Advances in Neural Information Processing Systems 7, 27-35. MIT Press.

My unpublished thesis work is accessible from http://cns-web.bu.edu/pub/mnx/negishi.html

-----------------------------------------------------
Michiro Negishi
-----------------------------------------------------
Dept. of Cognitive & Neural Systems, Boston Univ.
677 Beacon St., Boston, MA 02215
Email: negishi at cns.bu.edu Tel: (617) 353-6741
-----------------------------------------------------

From kdh at anatomy.ucl.ac.uk Tue Aug 25 07:57:51 1998
From: kdh at anatomy.ucl.ac.uk (Ken Harris)
Date: Tue, 25 Aug 1998 12:57:51 +0100
Subject: Neural networks and brain function
Message-ID: <199808251157.MAA01484@ylem.anat.ucl.ac.uk>

I'd like to add something to the debate about neural network modelling and brain function, in particular concerning the resolution of apparently conflicting models. It seems to me that the main contribution of neural networks to this question has been a change of emphasis. Before neural networks were common currency, a neurological model usually consisted of a statement that a particular brain structure was necessary for a particular type of task. For example:

"The cerebellum is necessary for motor control"
"The hippocampus is necessary for spatial function"
"The hippocampus is necessary for episodic memory"

After neural networks, we have a different set of analogies. We now make neurological models that ascribe a particular computational function to a brain structure.
For example: "The cerebellum performs supervised learning" "The hippocampus functions as an autoassociative memory" By talking about a computational function, rather than a type of task that a brain structure is needed for, a lot of apparent conflict can suddenly be resolved. In the example of the cerebellum, the evidence that the cerebellum is involved in motor control and classical conditioning, and even higher cognitive functions does not seem so contradictory. It is very plausible that a supervised learning network would be useful for all of these functions -- see for example the work of Kawato and Thompson. In the example of the hippocampus, work by Michael Recce and myself has shown how an autoassociative memory can play a role in both episodic memory and spatial function, in particular giving an animal localisation ability by performing pattern completion on partial egocentric maps. For those who might be interested: Recce, M. and Harris, K.D. "Memory for places: A navigational model in support of Marr's theory of hippocampal function" Hippocampus, vol 6, pp. 735-748 (1996) http://www.ncrl.njit.edu/papers/hpc_model.ps.gz ----------------------------------------------- Ken Harris Department of Anatomy and Developmental Biology University College London http://www.anat.ucl.ac.uk/~kdh From ingber at ingber.com Tue Aug 25 09:29:18 1998 From: ingber at ingber.com (Lester Ingber) Date: Tue, 25 Aug 1998 08:29:18 -0500 Subject: Paper: A simple options training model Message-ID: <19980825082918.A29949@ingber.com> The paper markets98_spread.ps.Z [40K] is available at my InterNet archive: %A L. Ingber %T A simple options training model %R LIR-98-2-SOTM %I Lester Ingber Research %C Chicago, IL %D 1998 %O URL http://www.ingber.com/markets98_spread.ps.Z Options pricing can be based on sophisticated stochastic differential equation models. However, many traders, expert in their art of trading, develop their skills and intuitions based on loose analogies to such models and on games designed to tune their trading skills, not unlike the state of affairs in many disciplines. An analysis of one such game reveals some simple but relevant probabilistic insights into the nature of options trading often not discussed in most texts. ======================================================================== Instructions for Retrieval of Code and Reprints Interactively Via WWW The archive can be accessed via WWW path http://www.ingber.com/ http://www.alumni.caltech.edu/~ingber/ where the last address is a mirror homepage for the full archive. Interactively Via Anonymous FTP Code and reprints can be retrieved via anonymous ftp from ftp.ingber.com. Interactively [brackets signify machine prompts]: [your_machine%] ftp ftp.ingber.com [Name (...):] anonymous [Password:] your_e-mail_address [ftp>] binary [ftp>] ls [ftp>] get file_of_interest [ftp>] quit The 00index file contains an index of the other files. Files have the same WWW and FTP paths under the main / directory; e.g., http://www.ingber.com/MISC.DIR/00index_misc and ftp://ftp.ingber.com/MISC.DIR/00index_misc reference the same file. Electronic Mail If you do not have WWW or FTP access, get the Guide to Offline Internet Access, returned by sending an e-mail to mail-server at rtfm.mit.edu with only the words send usenet/news.answers/internet-services/access-via-email in the body of the message. The guide gives information on using e-mail to access just about all InterNet information and documents. 
Additional Information Limited help assisting people with queries on my codes and papers is available only by electronic mail correspondence. Sorry, I cannot mail out hardcopies of code or papers. Lester ======================================================================== -- /* Lester Ingber Lester Ingber Research * * PO Box 06440 Wacker Dr PO Sears Tower Chicago, IL 60606-0440 * * http://www.ingber.com/ ingber at ingber.com ingber at alumni.caltech.edu */ From oreilly at grey.colorado.edu Tue Aug 25 11:54:10 1998 From: oreilly at grey.colorado.edu (Randall C. O'Reilly) Date: Tue, 25 Aug 1998 09:54:10 -0600 Subject: What have neural networks achieved? Message-ID: <199808251554.JAA15620@grey.colorado.edu> Another angle on the hippocampal story has to do with the phenomenon of catestrophic interference (McCloskey & Cohen, 1989), and the notion that the hippocampus and the cortex are complementary learning systems that each optimize different functional objectives (McClelland, McNaughton, & O'Reilly, 1995). In this case, the neural network approach provides a principled basis for understanding why we have a hippocampus, and what its functional characteristics should be. Interestingly, one of the "sucesses" of neural networks in this case was their dramatic failure in the form of the catestrophic interference phenomenon. This failure tells us something about the limitations of the cortical memory system, and thus, why we might need a hippocampus. - Randy @incollection{McCloskeyCohen89, author = {McCloskey, M. and Cohen, N. J.}, editor = {G. H. Bower}, title = {Catastrophic interference in connectionist networks: {The} sequential learning problem}, booktitle = {The Psychology of Learning and Motivation}, pages = {109-165}, year = 1989, publisher = {Academic Press}, address = {New York}, volume = 24 } @article{McClellandMcNaughtonOReilly95, author = {McClelland, J. L. and McNaughton, B. L. and O'Reilly, R. C.}, title = {Why There are Complementary Learning Systems in the Hippocampus and Neocortex: Insights from the Successes and Failures of Connectionst Models of Learning and Memory}, journal = {Psychological Review}, pages = {419-457}, year = {1995}, volume = {102} } This article has lots of references to the relevant neural network literature. A TR version is available from the following 2 ftp sites: ftp://cnbc.cmu.edu:/pub/pdp.cns/pdp.cns.94.1.ps.Z ftp://grey.colorado.edu/pub/oreilly/tr/pdp.cns.94.1.ps.Z +-----------------------------------------------------------------------------+ | Dr. Randall C. O'Reilly | | | Assistant Professor | | | Department of Psychology | Phone: (303) 492-0054 | | University of Colorado Boulder | Fax: (303) 492-2967 | | Muenzinger D251C | Home: (303) 448-1810 | | Campus Box 345 | email: oreilly at psych.colorado.edu | | Boulder, CO 80309-0345 | www: http://psych.colorado.edu/~oreilly | +-----------------------------------------------------------------------------+ From lazzaro at CS.Berkeley.EDU Tue Aug 25 12:27:58 1998 From: lazzaro at CS.Berkeley.EDU (John Lazzaro) Date: Tue, 25 Aug 1998 09:27:58 -0700 (PDT) Subject: Connectionist symbol processing Message-ID: <199808251627.JAA07316@snap.CS.Berkeley.EDU> > I would suggest that most recurrent neural net architectures > are not fundamentally more 'neural' than hidden Markov models - > think of an HMM as a neural net with second-order weights > and linear activation functions. 
We presented a continuous-time analog-circuit implementation of a HMM state decoder a few years ago at NIPS -- I've always felt that if you can make a clock-free analog system compute an algorithm well in silicon, its reasonable to expect it can be implemented usefully in biological neurons as well ... Lazzaro, J. P., Wawrzynek J., and Lippmann, R. (1996). A micropower analog VLSI HMM state decoder for wordspotting. In Jordan, M., Mozer, M., and Petsche, T. (eds), {\it Advances in Neural Information Processing Systems 9}. Cambridge, MA: MIT Press. Lazzaro, J., Wawrzynek, J., Lippmann, R. P. (1997). A micropower analog circuit implementation of hidden markov model state decoding. {\it IEEE Journal Solid State Circuits} {\bf 32}:8, 1200--1209. http://www.cs.berkeley.edu/~lazzaro/biblio/decoder.ps.gz --john lazzaro From jlm at cnbc.cmu.edu Tue Aug 25 14:26:33 1998 From: jlm at cnbc.cmu.edu (Jay McClelland) Date: Tue, 25 Aug 1998 14:26:33 -0400 (EDT) Subject: What have neural networks achieved? Message-ID: <199808251826.OAA27165@CNBC.CMU.EDU> I haven't been able to read all of the email on connectionists lately and so it is possible that the following is redundant, but it seems to me there is a real success story here. There has been a great deal of connectionist work on the processing of regular and exceptional material, initiated by the Rumelhart-McClelland paper on the past tense. Debate has raged on the subject of the past tense and work there is ongoing, but I won't claim a success story there at this time. What I would like to point to instead is the related topic of single word reading. Sejnowski and Rosenberg's NETTALK first extended connectionist ideas to this issue, and Seidenberg and McClelland went on to show that a connectionist model could account in great detail for the pattern of reaction times found in around 30 studies concerning the effects of regularity, frequency, and lexical neighbors on reading words aloud. This was followed by a resounding critique along the lines of Pinker and Prince's critique of R&M, coming this time from Derrick Besner (and colleagues) and Max Coltheart (and colleagues). Both pointed to the fact that the S&M model didn't do a very good job of reading nonwords, and both claimed that this reflected an in-principal limitation of a connectionist, single mechanism account: To do a good job with both, it was claimed, a dual route system was required. The success story is a paper by Plaut, McClelland, Seidenberg, and Patterson, in which it was shown in fact that a single mechanism, connectionist model can indeed account for human performance in reading both words and nonwords. The model replicated all the S&M findings, and at the same time was able to read non-words as well as human subjects, showing the same types of neighbor-driven responses that human readers show (eg MAVE is sometimes read to rhyme with HAVE instead of SAVE). Of course there are still some loose ends but it is no longer possible to claim that a single-mechanism account cannot capture the basic pattern of word and non-word reading data. The authors of PMSP all believe, I think, that there are semantic as well as phonological sources of influence on word reading, so that the system is, to an extent, a kind of dual-route system. This was in fact articulated in the earlier, SM formulation. 
This can lead to apparent dissociations in fMRI and effects of brain damage on reading, but the dissociation is fundamentally one of semantic vs phonological processes rather than lexical vs rule-guided processes. For example the phonological system, while sensitive to regularities, nevertheless captures knowledge of specific high-frequency exceptions. -- Jay McClelland From mozer at cs.colorado.edu Tue Aug 25 12:25:29 1998 From: mozer at cs.colorado.edu (Mike Mozer) Date: Tue, 25 Aug 98 10:25:29 -0600 Subject: What have neural networks achieved? In-Reply-To: Your message of Fri, 14 Aug 98 14:07:20 -0800. Message-ID: <199808251625.KAA17844@neuron.cs.colorado.edu> > b) What are the "big success stories" (i.e., of the kind the general public > could understand) for neural networks contributing to the construction of > "artificial" brains, i.e., successfully fielded applications of NN hardware > and software that have had a major commercial or other impact? I've been involved with a company, Sensory Inc., that produces low-cost neural-net-based ICs for speech recognition. We have sold several million units and have an 85% market share in dedicated speech recognition chips. The chip has been used in several dozen applications, including: toys, electronic learning aids, automobiles, consumer electronics, home appliances, light switches, telephones, and clocks. Due to cost constraints that limit RAM and processor speed, performing recognition with alternative approaches such as HMMs would not be feasible. The company web page sucks, but better ones are in the works with a listing of current products: www.sensoryinc.com. Don't blame me for the nonsensical jargon in the company literature. Mike Mozer From aminai at ececs.uc.edu Tue Aug 25 14:30:26 1998 From: aminai at ececs.uc.edu (Ali Minai) Date: Tue, 25 Aug 1998 14:30:26 -0400 (EDT) Subject: What have neural networks achieved? Message-ID: <199808251830.OAA03867@holmes.ececs.uc.edu> David Redish writes: In addition to the self-localization role, the hippocampus has been shown to replay recently traveled routes during sleep (Skaggs and McNaughton, 1996). However, the mechanisms that have been proposed to accomplish these two functions require incompatible connection matrices. Self-localization requires a symmetric component and route-replay requires an asymmetric component. We showed (Redish and Touretzky, 1998) that with the incorporation of external inputs representing spatial cues during self-localization (obviously necessary for accurate self-localization), self-localization can be accurate even with a weak asymmetric component, and that the weak asymmetric component is sufficient to replay the recently traveled routes (without the external input, which would presumably not be present during sleep). This shows that the two roles hypothesized for hippocampus are not incompatible. To add to David's very interesting comments on how self-localization and replay of learned sequences are not incompatible, I would point out that the hippocampal system has a variety of recurrent connection pathways at various hierarchical levels (e.g., CA3-CA3, dentate-hilus-dentate, entorhinal cortex-dentate-CA3-CA1-entorhinal cortex, etc.), and a variety of time-scales at which processes occur (e.g., a theta cycle, a gamma cycle, etc.) It is quite possible for functions requiring symmetric and asymmetric connectivities to coexist in the hippocampus if they occur in different subsystems and/or at different time-scales. 
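A toy sketch may make this coexistence concrete. The following cartoon (invented sizes and parameters, and not the Redish and Touretzky model itself) stores a sequence of patterns in a binary network whose weights have a symmetric autoassociative part plus a weak asymmetric part; a degraded, cued pattern still settles correctly, while the asymmetric part alone carries the stored order:

import numpy as np

rng = np.random.default_rng(0)
N, P = 200, 5
xi = rng.choice([-1, 1], size=(P, N))            # a sequence of stored patterns

W_sym = sum(np.outer(p, p) for p in xi) / N      # symmetric: pattern completion
W_asym = sum(np.outer(xi[k + 1], xi[k]) for k in range(P - 1)) / N
lam = 0.3                                        # weak asymmetric component

def overlap(s, k):
    return float(xi[k] @ s) / N

# (a) Self-localization cartoon: a degraded cue for pattern 0, plus external
# input, settles onto pattern 0 despite the asymmetric term.
s = xi[0] * np.where(rng.random(N) < 0.3, -1, 1)   # 30% of the bits flipped
cue = xi[0].astype(float)
for _ in range(10):
    s = np.where(W_sym @ s + lam * (W_asym @ s) + 0.5 * cue >= 0, 1, -1)
print("overlap with pattern 0 after settling:", overlap(s, 0))

# (b) Replay cartoon: with no external input, the weak asymmetric part alone
# (standing in for the slower dynamics of the real models) steps through
# the stored sequence.
s = xi[0].copy()
for k in range(1, P):
    s = np.where(W_asym @ s >= 0, 1, -1)
    print("step", k, "overlap with pattern", k, ":", round(overlap(s, k), 2))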
I often find that apparent conflicts or trade-offs in modeling result from our neglecting hierarchical considerations (in both space and time) and the possibility of multiple modes of operation. There is plenty of evidence for both in the brain, but a lot of neural modeling still focuses on single time-scales, single (or perhaps 2 or 3) modes, and ``compact'' systems. The hippocampal models developed by Redish and Touretzky are especially interesting because they place the hippocampus in a larger, multi-level context with other systems. Others are starting to address issues of temporal hierarchies in models of memory recall, phase precession, etc.

Also, on the point of episodic memory vs. cognitive mapping, it is possible to think of frameworks in which the two may appear as aspects (or even parts) of the same, more abstract, functionality.

-----------------------------------------------------------------------------
Ali A. Minai
Assistant Professor
Complex Adaptive Systems Laboratory
Department of Electrical & Computer Engineering and Computer Science
University of Cincinnati
Cincinnati, OH 45221-0030
Phone: (513) 556-4783 Fax: (513) 556-7326
Email: Ali.Minai at uc.edu Internet: http://www.ececs.uc.edu/~aminai/
-----------------------------------------------------------------------------

From jkolen at typhoon.coginst.uwf.edu Tue Aug 25 17:48:00 1998
From: jkolen at typhoon.coginst.uwf.edu (John F. Kolen)
Date: Tue, 25 Aug 1998 16:48:00 -0500
Subject: Two Positions Available
Message-ID: <9808251648.ZM7453@typhoon.coginst.uwf.edu>

The Institute for Human and Machine Cognition (IHMC) at the University of West Florida has immediate openings for a visiting research scientist and a visiting research programmer.

The successful visiting research scientist candidate must hold a Ph.D. in computer science (or equivalent qualification) and have a depth of knowledge in neural networks and computational modeling. Current projects include laser marksmen modeling, spectral analysis of inhomogeneous minerals, and image classification.

The successful visiting research programmer candidate must hold an M.S. in computer science (or equivalent qualification) and have C++ programming experience. Knowledge of neural networks and computational modeling is expected. Experience with optics, image processing, spectrometry, geology, human performance, or modeling real-world data will be helpful.

Both positions are contingent on project funding. Applicants may be asked to obtain a security clearance with the U.S. Department of Defense. Current projects include laser marksmen modeling, spectral analysis of heterogeneous minerals, and classification of geological formations from images.

Many IHMC projects involve interdisciplinary teamwork, and we are looking for persons who enjoy collaboration with others. Currently, interdisciplinary research is underway in the computational and philosophical foundations of AI, computer-mediated communication and collaboration, smart machines in education, knowledge-based systems, multimedia browsers, fuzzy logic, neural networks, software agents, spatial and temporal reasoning, diagnostic systems, cognitive psychology, reasoning under uncertainty and the design of electronic spaces. Salaries will be commensurate with the levels of qualification.

The IHMC was founded with legislative support in 1989 as an interdisciplinary research unit. Additionally, IHMC has succeeded in securing substantial extramural support and has established an enviable research and publication record.
Visit the IHMC web page at http://www.coginst.uwf.edu for more information. The University of West Florida is situated in a 1000-acre protected nature preserve bordering the Escambia River, and is approximately 14 miles north of the country's finest white sand beaches. New Orleans is 3 hours away by car. Please send vita or resume to Prof. John F. Kolen Institute For Human and Machine Cognition University of West Florida 11000 University Pkwy. Pensacola, FL 32514 -- John F. Kolen voice: (850)474-3075 Assistant Professor fax: (850)474-3023 Dept. of Computer Science University of West Florida Pensacola, FL 32514 From max at currawong.bhs.mq.edu.au Tue Aug 25 22:54:03 1998 From: max at currawong.bhs.mq.edu.au (Max Coltheart) Date: Wed, 26 Aug 1998 12:54:03 +1000 (EST) Subject: What have neural networks achieved? Message-ID: <199808260254.MAA11068@currawong.bhs.mq.edu.au> A non-text attachment was scrubbed... Name: not available Type: text Size: 1099 bytes Desc: not available Url : https://mailman.srv.cs.cmu.edu/mailman/private/connectionists/attachments/00000000/09d1131e/attachment-0001.ksh From joachim at moon.fit.qut.edu.au Tue Aug 25 23:51:51 1998 From: joachim at moon.fit.qut.edu.au (Joachim Diederich) Date: Wed, 26 Aug 1998 13:51:51 +1000 Subject: Connectionist symbolic processing Message-ID: <199808260351.NAA09637@moon.fit.qut.edu.au.fit.qut.edu.au> As a follow-up to Michael Healy's note on rule-extraction from neural networks, here are two more recent papers on the topic: Maire, F.: A partial order for the M-of-N rule-extraction algorithm. IEEE TRANSACTIONS ON NEURAL NETWORKS, Vol. 8, No .6, pages 1542-1544, November 1997. Tickle, A.B.; Andrews, R.; Golea, M.; Diederich, J.: The truth will come to light: directions and challenges in extracting the knowledge embedded within trained artificial neural networks. IEEE TRANSACTION ON NEURAL NETWORKS. Special Issue on Hybrid Systems. Scheduled for November 1998. We have a limited number of pre-prints available. Joachim Diederich ********************************************************************** Professor Joachim ("Joe") Diederich Director Machine Learning Research Centre (MLRC) Neurocomputing Laboratory / Data Mining Laboratory Queensland University of Technology _--_|\ Box 2434, Brisbane Q 4001 / QUT AUSTRALIA \_.--._/ Phone: +61 7 3864-2143 v Fax: +61 7 3864-1801 E-mail: joachim at fit.qut.edu.au or joachim at icsi.berkeley.edu WEB: http://www.fit.qut.edu.au/~joachim ********************************************************************** From ken at phy.ucsf.EDU Wed Aug 26 01:00:40 1998 From: ken at phy.ucsf.EDU (Ken Miller) Date: Tue, 25 Aug 1998 22:00:40 -0700 (PDT) Subject: function of hippocampus In-Reply-To: <199808251554.JAA15620@grey.colorado.edu> References: <199808251554.JAA15620@grey.colorado.edu> Message-ID: <13795.38520.152296.309141@coltrane.ucsf.edu> With respect to recent postings about models of hippocampus and memory, I'd like to toss in a cautionary note. A recent report (Elisabeth A. Murray and Mortimer Mishkin, "Object Recognition and Location Memory in Monkeys with Excitotoxic Lesions of the Amygdala and Hippocampus", J. Neuroscience, August 15, 1998, 18(16):6568-6582) finds no deficit in tasks involving visual recognition memory or spatial memory with lesions of hippocampus and amygdala. Instead, deficits in both cases are associated with, and only with, lesion of the overlying rhinal cortex. 
They mention in the discussion evidence that "has suggested that the hippocampus may be more important for path integration on the basis of self-motion cues than for location memory, per se" (though Redish' recent posting mentions evidence against this from recent experiments of Alyan and McNaughton; I couldn't find a reference in medline). This is the latest in a series of reports along these lines from the Mishkin lab, who did much of the original lesion work that seemed to implicate hippocampus in memory. I'm not in any way an expert on this literature -- only a very distant observer -- but I worry that, based on lesion studies that also involved lesions of overlying cortex, both the neuroscience and connectionists communities may have jumped to a wrong conclusion that the hippocampus has a special role in episodic and/or spatial memory. I'd be interested to know if there's still good reason to believe in such a role ... Ken Kenneth D. Miller telephone: (415) 476-8217 Dept. of Physiology fax: (415) 476-4929 UCSF internet: ken at phy.ucsf.edu 513 Parnassus www: http://www.keck.ucsf.edu/~ken San Francisco, CA 94143-0444 From jlm at cnbc.cmu.edu Wed Aug 26 01:37:54 1998 From: jlm at cnbc.cmu.edu (Jay McClelland) Date: Wed, 26 Aug 1998 01:37:54 -0400 (EDT) Subject: What have neural networks achieved? Message-ID: <199808260537.BAA07862@CNBC.CMU.EDU> Max Coltheart writes: > Randy O'Reilly said: > > Interestingly, one of the "sucesses" of neural networks in this case > was their dramatic failure in the form of the catestrophic > interference phenomenon. This failure tells us something about the > limitations of the cortical memory system, and thus, why we might need > a hippocampus. > > Think about the structure of this argument for a moment. It runs thus: > > 1. Neural networks suffer from catastrophic interference. > 2. Therefore the cortical memory system suffers from catastrophic > interference. > 3. That's why we might need a hippocampus. > > Is everyone happy with the idea that (1) implies (2)? Randy may not have provided quite a full enough description of the observations we made in the McClelland, McNaughton and O'Reilly article concerning what we called 'Complementary Learning Systems' in hippocampus and neocortex. The argument is quite a bit richer than Max's comment suggests, but I will endeavor to summarize it (for full justification and demonstration simulations, see the paper). The arguement, based on the successes as well as the failures of connectionist models of learning and memory, was this: The discovery of the structure present in large ensembles of events and experiences, such as e.g., the structure present in the relations between spelling and sound, requires what we called 'interleaved learning' --- learning in which the connection weights are adapted gradually so that the overall structure present in the ensemble can guide the learning process. It also requires the use of a componential coding scheme, which is essential for good generalization (this theme also appears in the Plaut et al paper, mentioned in my previous post in this discussion). We claimed the neocortex was specialized for structure-sensitive learning, and we observed that neural networks that exhibit this form of learning WOULD exhibit catastrophic interference IF forced to learn quickly, either by turning up the learning rate or by massive repetition of a very small and thus necessarily non-representative sample of training cases. 
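The effect is easy to reproduce in miniature. The sketch below is an invented toy, not a simulation from the paper: a one-layer network is first trained with interleaved, small-step updates on a set of associations, and is then given massed repetition of a single new association at a larger learning rate, after which its error on the old associations rises sharply:

import numpy as np

rng = np.random.default_rng(1)
n_in, n_out, n_items = 20, 10, 8
X = rng.choice([0.0, 1.0], size=(n_items, n_in))    # old associations
Y = rng.choice([0.0, 1.0], size=(n_items, n_out))
W = np.zeros((n_out, n_in))

def mse(W, X, Y):
    return float(np.mean((X @ W.T - Y) ** 2))

# Phase 1: interleaved training with a small learning rate.
for _ in range(3000):
    i = rng.integers(n_items)
    err = Y[i] - W @ X[i]
    W += 0.02 * np.outer(err, X[i])
print("error on old items after interleaved training:", round(mse(W, X, Y), 3))

# Phase 2: massed repetition of one new item at a larger learning rate.
x_new = rng.choice([0.0, 1.0], size=n_in)
y_new = rng.choice([0.0, 1.0], size=n_out)
for _ in range(200):
    err = y_new - W @ x_new
    W += 0.1 * np.outer(err, x_new)
print("error on old items after massed new learning:", round(mse(W, X, Y), 3))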
Basically, what's simply happening is that the network is learning the non-representative structure present in the sample, at the expense of whatever it might previously have learned. Max and others might be interested to know that cortical memory systems have been shown to suffer from catestrophic-interference like effects. Massive repetition of a couple of tactile stimuli spanning several fingers can destroy the topographic map in somatosensory cortex (this is research from Merzenich's group). Generally, however, the cortex avoids catestrophic interference by using a relatively small learning rate, so that, in the normal course of events, the weights will reflect a sufficient sample of the environment. To allow rapid learning of the contents of a particular experience, the arguement goes, a second learning system, complementary to the first, is needed; such a system has a higher learning rate and recodes inputs using what we call 'sparse, random conjunctive coding' to minimize interference (while simultaneously reducing the adequacy of generalization). These characteristics are just the ones that appear to characterize the hippocampal system: it is the part of the brain known to be crucial for the rapid learning of the contents of specific experiences; it is massively plastic; and neuronal recording studies indicate that it does indeed use sparse, random conjunctive coding. Citations for the relevant articles follow. -- Jay McClelland -------------------------------------- @Article{McClellandMcNaughtonOReilly95, author = "McClelland, J. L. and McNaughton, B. L. and O'Reilly, R. C." , year = "1995" , title = "Why there are complementary learning systems in the hippocampus and neocortex: {Insights} from the successes and failures of connectionist models of learning and memory" , journal= "Psychological Review", volume = "102", pages = "419-457" } @Article{OReillyMcClelland94, author = "O'Reilly, R. C. and McClelland, J. L.", title = "Hippocampal conjunctive encoding, storage, and recall: {Avoiding} a tradeoff", journal = "Hippocampus", year = 1994, volume = 4, pages = "661-682" } @Article{McClellandGoddard96, author = "McClelland, J. L. and Goddard, N. H." , year = "1996" , title = "Considerations arising from a complementary learning systems perspective on hippocampus and neocortex" , journal = "Hippocampus" , volume = "6" , pages = "654--665" } @Article{PlautlETAL96, author = "Plaut, D. C. and McClelland, J. L. and Seidenberg, M. S. and Patterson, K. E.", title = "Understanding Normal and Impaired Word Reading: {Computational} Principles in Quasi-Regular Domains", journal = "Psychological Review", volume = "103", pages = "56-115", year = "1996" } Sorry, I'm not sure the correct citation for the Merzenich finding mentioned. It may be: @Article{WangMerzenichSameshimaJenkins95, author = {Wang, X. and Merzenich, M. M. and Sameshima, K. and Jenkins, W. M.}, title = {Remodelling of hand representation in adult cortex determined by timing of tactile stimulation}, journal = {Nature}, year = {1995}, volume = {378}, pages = {71-75} } From arbib at pollux.usc.edu Wed Aug 26 02:58:00 1998 From: arbib at pollux.usc.edu (Michael A. Arbib) Date: Tue, 25 Aug 1998 22:58:00 -0800 Subject: Neural networks and brain function In-Reply-To: <199808251157.MAA01484@ylem.anat.ucl.ac.uk> Message-ID: Dear Dr. Harris: Thank you very much for your very helpful remarks. I would like to agree and disagree!! You write: >After neural networks, we have a different set of analogies. 
>We now make neurological models that ascribe a particular >computational function to a brain structure. For example: > > "The cerebellum performs supervised learning" > "The hippocampus functions as an autoassociative memory" > >By talking about a computational function, rather than a >type of task that a brain structure is needed for, a >lot of apparent conflict can suddenly be resolved. > >In the example of the cerebellum, the evidence that the cerebellum >is involved in motor control and classical conditioning, and >even higher cognitive functions does not seem so contradictory. >It is very plausible that a supervised learning network >would be useful for all of these functions -- see for example >the work of Kawato and Thompson. I agree - and in fact noted that the integration of work on the role of cerebellum in motor control and classical conditioning was a specific target of work at USC. You might reply: "Why is it a target? The problem is solved! The cerebellum does supervised learning! What more needs to be said?" And this is where I disagree - the observation that we might use supervised learning rather than Hebbian or reinforcement learning is only part of the issue under investigation. There are 2 complementary concerns: a) Many many functions can exploit supervised learning. Thus to show that A and B both use supervised learning in no way guarantees that the cerebellum carries out both of them - though I very much accept your point that the observation that they exploit the same learning mechanism seems a very useful step in that direction. b) Again, supervised learning can be realized in simple networks. We still have many questions to answer (we do have partial answers) as to why the cerebellar cortex has the structure it has, what might be the specific advantages of the actual mix of LTD and "re-potentiation" at the parallel-fiber-Purkinje cell synapse, and what is the relation between cerebellar cortex, cerebellar nucleus and inferior olive in a "generic" microcomplex. Even when we have answered that, we still have to ask whether - for posited function A or B - there is a set of cerebellar microcomplexes appropriately wired up to other brain regions to realize the supervised adaptation of that function. >In the example of the hippocampus, work by Michael Recce and >myself has shown how an autoassociative memory can play a role >in both episodic memory and spatial function, in particular >giving an animal localisation ability by performing pattern >completion on partial egocentric maps. I very much look forward to reading your paper! Nonetheless, the above observations also apply to autoassociative memory - this is not limited to HC. Conversely, recent work of ours suggests that, to fully understand its role in navigation, we must embed HC in a larger system including parietal cortex and other regions. It seems unlikely that the same pattern of embedding will account for episodic memory. So ... thank you for a stimulating general perspective. I look forward to other messages on brain modeling - both those adding to the stock of general principles, and those showing how the particularities of a system (LTP/LTD, neural morphology, embedding within larger networks of networks) account for its diverse functions. ********************************* Michael A. 
Arbib USC Brain Project University of Southern California Los Angeles, CA 90089-2520, USA arbib at pollux.usc.edu (213) 740-9220; Fax: 213-740-5687 http://www-hbp.usc.edu/HBP/ From mherrma at gwdg.de Wed Aug 26 05:37:14 1998 From: mherrma at gwdg.de (Michael Herrmann) Date: Wed, 26 Aug 1998 11:37:14 +0200 Subject: Staff Scientist Position Message-ID: <35E3D74A.41C6@gwdg.de> Staff Scientist Position in Theoretical Neuroscience Max-Planck-Institut fuer Stroemungsforschung Nonlinear Dynamics Group Goettingen, Germany The institute invites applications for a staff scientist position in its theory group. Candidates are expected to have excellent academic qualifications and a proven record of research in one or more of the following fields: computational neuroscience, theoretical brain research, systems neurobiology, adaptive behavior -- in addition to a good background in theoretical physics. Research at the institute focuses on mesoscopic systems in physics and biology and on computational neuroscience. The successful candidate is expected to carry out independent research in the field of her/his specialization and to collaborate with post-docs and graduate students. The group is closely affiliated with the University of Goettingen, where a "Habilitation" (secondary doctoral degree) may be pursued. For a detailed description of the research projects and resources of our group and for information about the city of Goettingen, please visit our WWW homepage at http://www.chaos.gwdg.de. The appointment will be for a fixed term of up to five years starting from September 1998 or later. Salary is according to the German BAT IIa/Ib bracket. The institute encourages applications from all qualified candidates, particularly women and persons with disabilities. Please send your application including CV, publication list (including up to three selected reprints), and a statement of your scientific interests and research plans as soon as possible to: Prof. Dr. Theo Geisel MPI fuer Stroemungsforschung Bunsenstrasse 10 D-37073 Goettingen, Germany From bryan at cog-tech.com Wed Aug 26 09:40:08 1998 From: bryan at cog-tech.com (Bryan B. Thompson) Date: Wed, 26 Aug 1998 09:40:08 -0400 Subject: What have neural networks achieved? In-Reply-To: <199808260254.MAA11068@currawong.bhs.mq.edu.au> (message from Max Coltheart on Wed, 26 Aug 1998 12:54:03 +1000 (EST)) Message-ID: <199808261340.JAA20766@cti2.cog-tech.com> Max, Think about the structure of this argument for a moment. It runs thus: 1. Neural networks suffer from catastrophic interference. 2. Therefore the cortical memory system suffers from catastrophic interference. 3. That's why we might need a hippocampus. Is everyone happy with the idea that (1) implies (2)? Max max at currawong.bhs.mq.edu.au I am not happy with the conclusion (1), above. Catastrophic interference is a function of the global quality of the weights involved in the network. More local networks are, of necessity, less prone to such interference as less overlapping subsets of the weights are used to maps the transformation from input to output space. Modifying some weights may have *no* effect on some other predictions. In the extreme case of table lookups, it is clear that catastropic interference completely disappears (along with most options for generalization, etc.:) In many ways, it seems that this statement is true for supervised learning networks in which weights are more global than not. 
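A toy contrast along these lines (illustrative only; the sizes and values are invented): relearning a single association in a network with shared weights disturbs the outputs for the other items, whereas in a lookup table it cannot, by construction.

import numpy as np

rng = np.random.default_rng(2)
keys = [f"item{i}" for i in range(5)]
X = rng.normal(size=(5, 8))            # distributed input codes
Y = rng.normal(size=(5, 3))

# Shared ("global") weights: least-squares fit to all five associations.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)
before = X @ W

# Relearn item0 only, by gradient steps on that single example.
target0 = Y[0] + 5.0                   # item0's target changes
for _ in range(100):
    err = target0 - X[0] @ W
    W += 0.05 * np.outer(X[0], err)
after = X @ W
print("max output change for the other items (shared weights):",
      round(float(np.abs(after[1:] - before[1:]).max()), 3))

# Table lookup: one entry per key; changing item0 touches nothing else.
table = {k: y.copy() for k, y in zip(keys, Y)}
table["item0"] = target0
print("output change for the other items (lookup table): 0.0 by construction")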
Other, more practical counter examples would include (differentiable) CMACs and radial basis function networks. A related property is the degree to which a learned structure ossifies in a network, such that the network is, more or less, unable to respond to a changing environment. This is related to being stuck in local minima, but the solution may even have been optimal for the initial environmental conditions. Networks, or systems of networks, which permit multiple explanatory theories to be explored at once are less susceptible to both of these pitfalls (catastrophic interference and ossification, or loss of plasticity). The support of "multiple explanatory theories" opens the door to area which does not appear to receive enough attention: neural architectures which perform many-to-many mappings vs learn to estimate a weighted average of their target observations. For example, you are drawing samples from a jar of colored marbles, or prediction of the part-of- speech to follow in a sentence. Making the wrong prediction is not an error, it should just lead to updating the probability distribution over the possible outcomes. Averaging the representations and predicting, e.g., green(.8) + red(.2) => greed(1.0), is the error. So, are there good reasons for believing that "cortical memory system"s (a) exhibit these pitfalls (catastrophic interference, ossification or loss of plasticity, and averaging target observations). or (b) utilize architectures which minimize such effects. Clearly, the answer will be a mixture, but I believe that these effects are all minimized in our adaptive behaviors. -- bryan thompson bryan at cog-tech.com From cjcb at molson.ho.lucent.com Wed Aug 26 09:41:05 1998 From: cjcb at molson.ho.lucent.com (Chris Burges) Date: Wed, 26 Aug 1998 09:41:05 -0400 Subject: neural network success story Message-ID: <199808261341.JAA28643@cottontail.lucent.com> > b) What are the "big success stories" (i.e., of the kind the general public > could understand) for neural networks contributing to the construction of > "artificial" brains, i.e., successfully fielded applications of NN hardware > and software that have had a major commercial or other impact? Lucent Technologies sells the Lucent Courtesy Amount Reader (LCAR) to read financial amounts on US checks. This software is currently installed in a number of banks and is processing several million checks per day. LCAR reads machine print and handwritten amounts on both personal and business checks. The amount recognition algorithms are based on feed-forward convolutional neural networks. The basic ideas underlying the graph-based approach to both segmentation and neural network training can be found in: C.J.C. Burges, O. Matan, Y. Le Cun, J.S. Denker, L.D. Jackel, C.E. Stenard, C.R. Nohl, J.I. Ben, "Shortest Path Segmentation: A Method For Training a Neural Network to Recognize Character Strings", IJCNN Conference Proceedings Vol 3, pp. 165-172, 1992 J. Denker, C.J.C. Burges, "Image Segmentation and Recognition", in The Mathematics of Generalization: Proceedings of the SFI/CNLS Workshop on Formal Approaches to Supervised Learning, Addison Wesley, ISBN 0-201-40985-2, 1994 C.J.C. Burges, J.I. Ben, J.S. Denker, Y. LeCun and C.R. Nohl, "Off Line Recognition of Handwritten Postal Words Using Neural Networks", International Journal of Pattern Recognition and Artificial Intelligence, Vol. 7, Number 4, p. 
689, 1993; also in Advances in Pattern Recognition Systems Using Neural Network Technologies, Series in Machine Perception and Artificial Intelligence, Volume 7, Edited by I. Guyon and P.S.P Wang, World Scientific, 1993. More recently, the graph-based approach has been significantly extended to allow end-to-end training of large, complex systems. For this see: Leon Bottou, Yoshua Bengio, Yann Le Cun, "Global Training of Document Processing Systems using Graph Transformer Networks", In Proceedings of Computer Vision and Pattern Recognition, Puerto Rico, IEEE, 1997. An extended paper discussing this will appear soon in Transactions of IEEE: Yann Le Cun, Leon Bottou, Yoshua Bengio, and Patrick Haffner, "Gradient Based Learning Applied to Document Recognition", to appear in Proceedings of IEEE. The underlying neural networks used by the system are the convolutional feed forward "LeNet" series. These are pretty well known by now. One place to go for a description, and a comparison with other algorithms, is: Yann Le Cun, Lawrence D. Jackel, Leon Bottou, Corinna Cortes, John S. Denker, Harris Drucker, Isabelle Guyon, Urs A. Muller, Eduard Sackinger, Patrice Simard, and Vladimir N. Vapnik. Learning algorithms for classification: A comparison on handwritten digit recognition. In J. H. Oh, C. Kwon, and S. Cho, editors, Neural Networks: The Statistical Mechanics Perspective, pages 261-276. World Scientific, 1995. There is quite a bit more to the LCAR system than is represented by these refs. (e.g. how to read handwritten fractional amounts), but those methods are not yet written up anywhere. However you can find a little more information on the LCAR system itself at http://www.lucent.dk/ssg/html/lcar.html. - Chris Burges burges at lucent.com From oreilly at grey.colorado.edu Wed Aug 26 13:08:40 1998 From: oreilly at grey.colorado.edu (Randall C. O'Reilly) Date: Wed, 26 Aug 1998 11:08:40 -0600 Subject: function of hippocampus In-Reply-To: <13795.38520.152296.309141@coltrane.ucsf.edu> (message from Ken Miller on Tue, 25 Aug 1998 22:00:40 -0700 (PDT)) References: <199808251554.JAA15620@grey.colorado.edu> <13795.38520.152296.309141@coltrane.ucsf.edu> Message-ID: <199808261708.LAA16289@grey.colorado.edu> Ken Miller writes: > With respect to recent postings about models of hippocampus and > memory, I'd like to toss in a cautionary note. A recent report > (Elisabeth A. Murray and Mortimer Mishkin, "Object Recognition and > Location Memory in Monkeys with Excitotoxic Lesions of the Amygdala > and Hippocampus", J. Neuroscience, August 15, 1998, 18(16):6568-6582) > finds no deficit in tasks involving visual recognition memory or > spatial memory with lesions of hippocampus and amygdala. Instead, > deficits in both cases are associated with, and only with, lesion of > the overlying rhinal cortex. One general take on the division of labor between the rhinal cortex and the hippocampus proper is that the rhinal cortex can subserve "familiarity" based tasks (e.g., recognition), and the hippocampus is only necessary for actually recalling information. Familiarity might be subserved by priming-like, small weight changes that shift the relative balance of recently-activated representations. In contrast, the hippocampus proper seems particularly well suited for doing pattern completion, where a cue triggers the recall (completion) of a previously stored pattern. This requires storing a conjunctive representation that binds together all the elements of an event (so as to be recalled with a partial cue). 
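As a cartoon of pattern completion from a partial cue (a generic sparse, Willshaw-style autoassociator with invented sizes, and not the specific hippocampal model discussed here):

import numpy as np

rng = np.random.default_rng(3)
N, active, n_pat = 400, 12, 10             # sparse codes: 12 of 400 units on

def random_pattern():
    p = np.zeros(N, dtype=int)
    p[rng.choice(N, size=active, replace=False)] = 1
    return p

patterns = [random_pattern() for _ in range(n_pat)]

# Storage: clipped Hebbian outer products (a binary weight matrix), i.e. a
# conjunctive trace of which units were co-active in some stored event.
W = np.zeros((N, N), dtype=int)
for p in patterns:
    W |= np.outer(p, p)

# Recall: cue with only half of pattern 0's active units.
p0 = patterns[0]
cue = p0.copy()
cue[np.flatnonzero(p0)[active // 2:]] = 0  # drop half of the active bits

h = W @ cue
recalled = (h >= cue.sum()).astype(int)    # units connected to every cue bit
print("cue overlap with stored pattern   :", int(cue @ p0), "of", active)
print("recall overlap with stored pattern:", int(recalled @ p0), "of", active)
print("spurious active units             :", int(recalled.sum() - recalled @ p0))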
Both familiarity and recollection can contribute to recognition memory, but having just familiarity can presumably get you pretty far. There is a growing literature that generally supports this distinction, some refs included below. I can't comment as to how this relates to spatial navigation. - Randy @article{Yonelinas97, author = {Yonelinas, A. P.}, title = {Recognition memory {ROCs} for item and associative information: The contribution of recollection and familiarity.}, journal = {Memory and Cognition}, pages = {747-763}, year = {1997}, volume = {25} } Aggleton & Brown, in press. Episodic Memory, Amnesia, and the Hippocampal-Anterior Thalamic Axis. Behavioral Brain Sciences, (penultimate draft available from BBS ftp site). @incollection{OReillyNormanMcClelland98, author = {O'Reilly, R. C. and Norman, K. A. and McClelland, J. L.}, editor = {Jordan, M. I. and Kearns, M. J. and Solla, S. A.}, title = {A Hippocampal Model of Recognition Memory}, booktitle = {Advances in Neural Information Processing Systems 10}, year = {1998}, publisher = {MIT Press}, address = {Cambridge, MA} } this is available as: ftp://grey.colorado.edu/pub/oreilly/papers/hip_rm_nips.ps +-----------------------------------------------------------------------------+ | Dr. Randall C. O'Reilly | | | Assistant Professor | | | Department of Psychology | Phone: (303) 492-0054 | | University of Colorado Boulder | Fax: (303) 492-2967 | | Muenzinger D251C | Home: (303) 448-1810 | | Campus Box 345 | email: oreilly at psych.colorado.edu | | Boulder, CO 80309-0345 | www: http://psych.colorado.edu/~oreilly | +-----------------------------------------------------------------------------+ From max at currawong.bhs.mq.edu.au Wed Aug 26 19:07:42 1998 From: max at currawong.bhs.mq.edu.au (Max Coltheart) Date: Thu, 27 Aug 1998 09:07:42 +1000 (EST) Subject: What have neural networks achieved? Message-ID: <199808262307.JAA13303@currawong.bhs.mq.edu.au> A non-text attachment was scrubbed... Name: not available Type: text Size: 5004 bytes Desc: not available Url : https://mailman.srv.cs.cmu.edu/mailman/private/connectionists/attachments/00000000/782f2d7c/attachment-0001.ksh From horn at neuron.tau.ac.il Wed Aug 26 19:55:54 1998 From: horn at neuron.tau.ac.il (David Horn) Date: Thu, 27 Aug 1998 02:55:54 +0300 (IDT) Subject: What have neural networks achieved? Message-ID: Neuronal Regulation: A Complementary Mechanism to Hebbian Learning ------------------------------------------------------------------ We would like to point out the use of neural networks not for modelling a particular nucleus or cortical area, but for introducing and testing a general principle of information processing in the brain. Hebb can be viewed as the founder of this approach, suggesting a principle of how memories can be encoded in neuronal circuits. His ideas are still being tested today, and with the advent of knowledge regarding short term and long term synaptic plasticity, an understanding of learning and memory seems to be imminent. Yet there are still many open questions. One of the most interesting ones is the maintenance of memories over long times in the face of continuous synaptic metabolic turnover. We have recently studied this question theoretically (1) and concluded that to achieve long-term memory there has to exist a Neuronal Regulation mechanism with the following properties: 1. Multiplicative modification of all excitatory synapses projecting on a pyramidal neuron by a common, joint factor. 2. 
The magnitude of this regulatory neuronal factor changes inversely with respect to the neuron's post-synaptic potential, or the neuron's firing activity. In contrast to Hebbian changes, the synaptic modifications do not occur on the individual synaptic level as a function of the correlation between the firing of its pre and post-synaptic neurons, but take place in unison over all the synapses projecting on a neuron, as function of its membrane potential. In a series of very elegant slice experiments in rat, Turrigiano et al (2) have recently observed such phenomena. They find activity dependent changes in AMPA mediated mini EPSCs of pyramidal neurons. The regulatory process that they have observed has the features listed above. We believe that this newly observed mechanism serves as a complement to Hebbian synaptic learning. In our studies we found that it regulates basins of attraction of memories, thus preventing formation of pathologic attractors. Neuronal regulation may hence play a uniquely important role in preventing clinical and cognitive abnormalities like schizophrenic positive symptoms, that may result from the formation of such pathologic attractors (3,4). Activity-dependent neural regulatory processes have been previously observed experimentally (5) and studied theoretically (6,7). We were led to the problem of memory maintenance after first studying a neural model of Alzheimer's disease, where the late stages of regulatory processes, that are hypothesized to maintain cognitive function during normal aging, seem to fail (8). References: 1. D. Horn, N. Levy and E. Ruppin: Memory maintenance via neuronal regulation. Neural Computation, 10, 1-18 (1998). 2. G.G. Turrigiano, K. R. Leslie, N. S. Desai, L. C. Rutherford and S. B. Nelson: Activity-dependent scaling of quantal amplitude in neocortical neurons. Nature, 391, 892-895 (1998). 3. D. Horn and E. Ruppin: Compensatory mechanisms in an attractor neural network model of Schizophrenia. Neural Computation 7, 1494-1517 (1994). 4. E. Ruppin, J. Reggia and D. Horn: A neural model of positive schizophrenic symptoms. Schizophrenia Bulletin 22, 105-123 (1996). 5. G. LeMasson, E. Marder and L. F. Abbott: Activity-dependent regulation of conductances in model neurons. Science, 259, 1915-1917 (1993). 6. L. F. Abbott and G. LeMasson: Analysis of neuron models with dynamically regulated conductances. Neural Computation, 5, 823-842 (1993). 7. A. van Ooyen: Activity-dependent neural network development. Network, 5, 401-423 (1994). 8. D. Horn, N. Levy and E. Ruppin: Neuronal-based synaptic compensation: A computational study in Alzheimer's disease. Neural Computation 8, 1227-1243 (1996). David Horn Eytan Ruppin horn at neuron.tau.ac.il ruppin at math.tau.ac.il ---------------------------------------------------------------------------- Prof. David Horn horn at neuron.tau.ac.il School of Physics and Astronomy http://neuron.tau.ac.il/~horn Tel Aviv University Tel: ++972-3-642-9305, 640-7377 Tel Aviv 69978, Israel. Fax: ++972-3-640-7932 From terry at salk.edu Wed Aug 26 21:56:08 1998 From: terry at salk.edu (Terry Sejnowski) Date: Wed, 26 Aug 1998 18:56:08 -0700 (PDT) Subject: What have neural networks achieved? 
Message-ID: <199808270156.SAA06357@helmholtz.salk.edu> A footnote to Jay's last post on interference: One of the best established facts about memory is the spacing effect -- long term retention is much better for a wide variety of tasks and materials if the training is spaced in time rather than massed (cramming may help for the test tomorrow, but you won't remember it next year). Charlie Rosenberg showed that NETtalk exhibits a robust spacing effect when learning a new set of words. The explanation is similar to the one Jay has provided: You don't want to find the nearest place in weight space that codes the new words, but the nearest location in weight space that codes the old words and the new ones. Rosenberg, C. R. and Sejnowski, T. J., The effects of distributed vs massed practice on NETtalk, a massively-parallel network that learns to read aloud, Proceedings 8th Annual Conference of the Cognitive Science Society, Amherst, MA (August 1986). Whether the hippocampus is "replaying" recent experiences to the neocortex during sleep is an open question, though there is some evidence for this in rats. For further discussion see: Sejnowski, T. J., Sleep and memory, Current Biology 5, 832-834 (1995). There are high-amplitude thalamocortical rhythms that occur during sleep whose function is unknown. The mechanisms underlying these slow rhythms have been studied at the biophysical level, and incorporated into network models, which would qualify for a "success" story for understanding large-scale dynamical properties of brain systems: Steriade, M., McCormick, D. A., Sejnowski, T. J., Thalamocortical oscillations in the sleeping and aroused brain, Science 262, 679-685 (1993). Destexhe, A. and Sejnowski, T. J., Synchronized oscillations in thalamic networks: Insights from modeling studies, In: M. Steriade, E. G. Jones and D. A. McCormick (Eds.) Thalamus, Elsevier, pp 331-371 (1997). These models are a first step toward understanding the function of these oscillations, and perhaps someday the function of sleep, which remains a deep mystery. Regarding the comment by Ken Miller, the regions of the cortex that surround the hippocampus, including the entorhinal cortex, the perirhinal cortex and the parahippocampal cortex, are staging areas for converging inputs to the hippocampus. Stuart Zola has shown that the severity of amnesia following lesions of these areas in monkeys is greater as more surrounding cortical areas are included in the lesion. The famous case of HM had surgical removal of the temporal lobe which included the areas surrounding the hippocampus. The view in the field is no longer to think of the hippocampus as the primary site but as part of a memory system in reciprocal interaction with these cortical areas. Other brain areas including frontal cortex and the cerebellum also are involved: Tulving E; Markowitsch HJ. Memory beyond the hippocampus. Current Opinion in Neurobiology, 1997 Apr, 7(2):209-16. Functional magnetic resonance imaging is a powerful new tool for measuring activity and has been applied to memory systems -- see the latest results in the 21 August issue of Science. Terry -----

From aminai at ececs.uc.edu Thu Aug 27 01:18:27 1998 From: aminai at ececs.uc.edu (Ali Minai) Date: Thu, 27 Aug 1998 01:18:27 -0400 (EDT) Subject: function of hippocampus Message-ID: <199808270518.BAA06571@holmes.ececs.uc.edu> Ken Miller writes: With respect to recent postings about models of hippocampus and memory, I'd like to toss in a cautionary note. A recent report (Elisabeth A.
Murray and Mortimer Mishkin, "Object Recognition and Location Memory in Monkeys with Excitotoxic Lesions of the Amygdala and Hippocampus", J. Neuroscience, August 15, 1998, 18(16):6568-6582) finds no deficit in tasks involving visual recognition memory or spatial memory with lesions of hippocampus and amygdala.... I'm not in any way an expert on this literature -- only a very distant observer -- but I worry that, based on lesion studies that also involved lesions of overlying cortex, both the neuroscience and connectionists communities may have jumped to a wrong conclusion that the hippocampus has a special role in episodic and/or spatial memory. I'd be interested to know if there's still good reason to believe in such a role ... The concern expressed here is certainly warranted --- and not just on theories of hippocampal function. I think a lot of us are increasingly skeptical about theories of a simple, unitary function for the hippocampus. Indeed, different people often mean different things when they use the term ``hippocampus''. That having been said, I do think (and others can marshall the evidence better than I can) that a preponderance of evidence favors a hippocampal involvement in episodic memory and, at least in rodents, spatial cognition. The data from lobotomy patients such as H.M. and the extensive series of results from Squire's group provide convincing evidence that the hippocampus and its surrounding regions are involved in certain types of memory. Whether this role is central or peripheral (but important) is not clear, and I agree that most theories about the CA3 or the hippocampus as the site of associative storage --- temporary or otherwise --- are driven primarily by the intriguing structural analogies with recurrent neural networks. However, that is not necessarily a bad way to proceed. A major problem with experimental neuroscience is its tendency to produce oceans of data based on very narrowly focused experiments. Addressing this with formal large-scale theories provides a valuable --- if imperfect --- means of thinking about the big picture, and we need more of such theorizing. The issue of hippocampal involvement in spatial cognition in rodents is based on a very large body of lesion studies, but is given overwhelming credibility, in my opinion, by the undeniable existence of place cells and head-direction cells. The systematic study of sensory, behavioral, mnemonic, and other correlates of this organized cell activity provides convincing evidence that the hippocampus ``knows'' a great deal about the animal's spatial environment, is very sensitive to it, and responds robustly to disruptions of landmarks, etc. Recent reports on reconstructing physical location from place cell activity (Wilson and McNaughton, Science, 1993; Zhang et al., J. Neurophysiol., 1998; Brown et al., J. Neurosci, in press) clearly show that very accurate information about an animal's spatial position is available in the hippocampus. It must be used for something. Similar results are available about head direction cells. I do not think we really understand what role the rodent hippocampus plays in spatial cognition, but it is hard to dispute that it plays some --- possibly many --- important roles. I think that, as theories about hippocampal function begin to place the hippocampus in the larger context of other interconnected systems (e.g., in the work of Redish and Touretzky), we will move away from the urge to say, ``Here! 
This is what the hippocampus does'' and towards the recognition that it is probably an important part of a larger system for spatial cognition. Indeed, it is quite possible that, when we do arrive at a satisfactory explanation of hippocampal function, we will have no name for it in our current vocabulary (though I have no doubt that psychologists will invent one:-). Finally, one issue that is particularly relevant to hippocampal theories is the possibility that the categories of memory (e.g., episodic, declarative, etc.) or task (DNMS, spatial memory, working memory, etc.) that we use in our theorizing may not match up with the categories relevant to actual hippocampal functionality. Perhaps we are trying to build a science of chemistry based on air, water, fire, and earth. The good news is that the chemistry experiment was eventually successful and we did find our way to the correct elemental categories. Ali ----------------------------------------------------------------------------- Ali A. Minai Assistant Professor Complex Adaptive Systems Laboratory Department of Electrical & Computer Engineering and Computer Science University of Cincinnati Cincinnati, OH 45221-0030 Phone: (513) 556-4783 Fax: (513) 556-7326 Email: Ali.Minai at uc.edu Internet: http://www.ececs.uc.edu/~aminai/

From dwang at cis.ohio-state.edu Wed Aug 26 13:43:38 1998 From: dwang at cis.ohio-state.edu (DeLiang Wang) Date: Wed, 26 Aug 1998 13:43:38 -0400 Subject: What have neural networks achieved? Message-ID: <35E4494A.5C1@cis.ohio-state.edu> On the neuroscience front, a major success story of neural networks is the temporal (oscillatory) correlation theory, proposed and systematically advocated by Christoph von der Malsburg of USC and Univ. of Bochum. His pioneering theory was first described in 1981 (see below) in perhaps the most quoted technical report in neural networks (for a brief but earlier speculation along this line see Milner, 1974). His theory and prediction led to the first two confirmatory reports by Eckhorn et al. (1988) and Gray et al. (1989). Since then numerous experiments have been conducted that confirm the theory (not without some controversy), including many papers published in Nature and Science (see Phillips and Singer, 1997, for a recent review).

From rao at salk.edu Thu Aug 27 03:34:58 1998 From: rao at salk.edu (Rajesh Rao) Date: Thu, 27 Aug 1998 00:34:58 -0700 (PDT) Subject: Neural networks and brain function In-Reply-To: Message-ID: <199808270734.AAA23721@dale.salk.edu> > This failure tells us something about the limitations of the cortical > memory system, and thus, why we might need a hippocampus. Speaking of the cortex, some promising results have been obtained in recent years with regard to explaining cortical receptive field properties and interpreting cortical feedback/lateral connections using statistical principles such as maximum likelihood and Bayesian estimation. This line of research goes back to the early ideas of redundancy reduction and predictive coding advocated by Attneave (1954), MacKay (1956), and Barlow (1961).
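For readers who have not met the predictive coding idea, a minimal sketch may help (this is a generic linear generative model fit by gradient descent, not any of the specific models cited in this thread; the input size, number of causes, learning rates, and the weak prior are arbitrary assumptions of mine): the feedback pathway carries a prediction of the input, the feedforward signal carries only the residual error, and both the internal causes and the generative weights are adjusted to shrink that error.

import numpy as np

# Minimal predictive-coding / generative-model sketch (illustrative only):
# the "higher level" holds causes r and generative weights U; the feedback
# prediction is U @ r, the feedforward signal is the residual error, and
# r and U follow gradient descent on the squared error (plus a weak prior on r).
rng = np.random.default_rng(2)
n_input, n_causes = 64, 16
U = rng.normal(0, 0.1, (n_input, n_causes))   # generative ("synthesis") weights

def infer(image, steps=50, lr_r=0.05, prior=0.01):
    r = np.zeros(n_causes)
    for _ in range(steps):
        error = image - U @ r                 # feedforward residual
        r += lr_r * (U.T @ error - prior * r) # settle the causes
    return r

def learn(images, epochs=20, lr_U=0.01):
    global U
    for _ in range(epochs):
        for image in images:
            r = infer(image)
            error = image - U @ r
            U += lr_U * np.outer(error, r)    # slow weight update
            U /= np.maximum(np.linalg.norm(U, axis=0), 1e-8)  # keep basis bounded

# random "patches" stand in here for natural-image input
patches = rng.normal(0, 1, (200, n_input))
learn(patches)
r = infer(patches[0])
print(float(np.linalg.norm(patches[0] - U @ r)))  # residual after settling

Trained on natural image patches rather than the random data used here, objectives in this general family (usually with an explicit sparseness or independence constraint) are what yield the receptive-field-like basis functions discussed below.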
More recent incarnations of these ideas have been in the form of networks that attempt to learn sparse efficient codes (Olshausen and Field, Lewicki and Olshausen), networks that aim to maximize statistical independence of outputs (approaches of Bell and Sejnowski, van Hateren and Ruderman using ICA - http://www.cnl.salk.edu/~tewon/ica_cnl.html), networks that try to learn translation-invariant codes (Dana Ballard and myself), and networks that exploit biological constraints such as rectification for efficient coding (Lee and Seung, Hinton and Ghahramani, Dana Ballard and myself). Application of these algorithms to natural images produce spatial and spatiotemporal receptive field properties qualitatively similar to those observed in the visual cortex (the important and related line of research on correlation-based models of development by Ken Miller and others has already been mentioned in this thread). In the realm of hierarchical models, the early proposal of MacKay and more recently Mumford, ascribing to feedback connections the role of predicting or anticipating inputs, has been formalized in terms of learning generative models of input signals, the idea being that the feedback pathways might represent a learned statistical model of how the inputs are being generated ("synthesis" as opposed to "analysis" in the feedforward pathways). Examples include the work of Dayan, Hinton, Neal and Zemel (Helmholtz machine), Kawato, Hayakawa and Inui (forward-inverse optics model), Dana Ballard and myself (extended Kalman filter model), and related work by people such as Pece, Softky, Ullman, and others (I apologize if I inadvertently missed someone - please post a reply to add to this list). The work on hierarchical models is also closely related to the algorithms in the previous paragraph in that both rely on the idea of generative models, the differences being in the type of constraints imposed and the definition of statistical efficiency used. Although the results obtained thus far have been encouraging, the precise details regarding the neurobiological implementation of these algorithms in the cortex is far from clear. There is also a need for models that allow efficient learning of non-linear hierarchical generative models while at the same time respecting cortical neuroanatomical constraints. This gives me the excuse to advertise (somewhat shamelessly) a post-NIPS workshop on statistical theories of cortical function: the web page http://www.cnl.salk.edu/~rao/workshop.html contains more details and links to the web pages of some of the people pursuing this line of research. References: @article (Attneave54, author = "F. Attneave" , title = "Some informational aspects of visual perception", journal = "Psychological Review" , volume = "61" , number = "3" , year = "1954" , pages = "183-193" ) @incollection{MacKay56, author = "D. M. MacKay", title = "The epistemological problem for automata", editors = "C. E. Shannon and J. McCarthy", booktitle = "Automata Studies", pages = "235-251", publisher = "Princeton, NJ: Princeton University Press", year = "1956" } @incollection{Barlow61, author = "H. B. Barlow", title = "Possible principles underlying the transformation of sensory messages", editor = "W. A. Rosenblith", booktitle = "Sensory Communication", pages = "217-234", publisher = "Cambridge, MA: MIT Press", year = "1961" } (Other references to work mentioned above can be obtained from the web pages of the researchers - see the workshop page given above for some useful links). --- Rajesh P.N. 
Rao, Ph.D. Internet: rao at salk.edu The Salk Institute, CNL & Sloan Ctr VOX: 619-453-4100 x1215 10010 N. Torrey Pines Road FAX: 619-587-0417 La Jolla, CA 92037 WWW: http://www.cnl.salk.edu/~rao/

From jlm at cnbc.cmu.edu Thu Aug 27 07:57:45 1998 From: jlm at cnbc.cmu.edu (Jay McClelland) Date: Thu, 27 Aug 1998 07:57:45 -0400 (EDT) Subject: What have neural networks achieved? In-Reply-To: <199808262307.JAA13303@currawong.bhs.mq.edu.au> (message from Max Coltheart on Thu, 27 Aug 1998 09:07:42 +1000 (EST)) Message-ID: <199808271157.HAA01080@CNBC.CMU.EDU> Max Coltheart writes: To account for surface dyslexia (reading YACHT as "yatched"), they stopped the training of the network before it had successfully learned low-frequency exception words such as this one, and postulated that in the normal reader such words can only be read aloud with input to phonology from a second system (semantics). Two problems with this: (a) it involves giving up the very thing that Jay says was an achievement: a single mechanism that can read aloud all exception words plus nonwords; and (b) it predicts that anyone with severe semantic impairment will also show surface dyslexic reading, which is not the case; several recent papers have documented patients with very poor semantics but very good reading of exception words (e.g. Cipolotti & Warrington, J Int Neuropsych Soc 1995 1 104-110). The problems here are more apparent than real. First, regarding (b): because the spelling-sound mechanism in our model IS capable of learning both the regular and exception words correctly, our model is able to handle cases in which there is severe semantic impairment and no surface dyslexia (see Plaut 97 citation below). We view the extent of reliance on semantics in reading words aloud as a premorbid individual difference variable. Regarding (a), we do not relax the claim that a single (spelling-sound) mechanism CAN account for reading of both regular and exception items, we only suggest that readers CAN ALSO read via meaning, and this allows the spelling-sound system to be lazy in acquiring the hardest items, namely the low frequency exceptions; the extent of the laziness becomes parameter dependent, and thus a natural place for individual differences to arise within the context of the model. All agree that our models should account for disorders as well as unimpaired performance. Our model does account for one thing that the standard dual route model does not account for, which is the fact that all fluent (see note) surface dyslexia patients show spared reading of high frequency exceptions. According to the dual-route approach, it ought to be possible to eliminate exception word reading entirely, but there are no fluent surface dyslexia patients who exhibit this pattern. -- Jay McClelland and Dave Plaut note: we hope we all agree that non-fluent SD patients are not relevant to this debate... sorry if this begins to get technical! @article ( Plaut97, key = "Plaut" , author = "David C.
Plaut" , year = "1997" , title = "Structure and Function in the Lexical System: {Insights} from Distributed Models of Naming and Lexical Decision" , journal = LCP , volume = 12 , pages = "767-808" , keywords= "semantics, reading" ) From kdh at anatomy.ucl.ac.uk Thu Aug 27 10:24:28 1998 From: kdh at anatomy.ucl.ac.uk (Ken Harris) Date: Thu, 27 Aug 1998 15:24:28 +0100 Subject: Neural networks and brain function Message-ID: <199808271424.PAA08096@ylem.anat.ucl.ac.uk> Michael Arbib writes: > a) Many many functions can exploit supervised learning. Thus to show > that A and B both use supervised learning in no way guarantees that the > cerebellum carries out both of them. Agreed. > b) Again, supervised learning can be realized in simple networks. We > still have many questions to answer (we do have partial answers) as to > why the cerebellar cortex has the structure it has, what might be the > specific advantages of the actual mix of LTD and "re-potentiation" at > the parallel-fiber-Purkinje cell synapse, and what is the relation > between cerebellar cortex, cerebellar nucleus and inferior olive in a > "generic" microcomplex. Even when we have answered that, we still have > to ask whether - for posited function A or B - there is a set of > cerebellar microcomplexes appropriately wired up to other brain regions > to realize the supervised adaptation of that function. Again agreed. Simple connectionist networks can be no more than an analogy for the functioning of the brain. Although sometimes it is possible to model the function of a brain structure with out modelling its circuitry. For example, some of Kawato's simulations model the cerebellum by a backprop net. > Conversely, recent work of ours suggests that, to fully understand its > role in navigation, we must embed HC in a larger system including parietal > cortex and other regions. It seems unlikely that the same pattern of > embedding will account for episodic memory. I absolutely agree with the need to embed the hippocampus in a larger system. But I do think this can be done in a way consistent with a role in episodic memory. Without getting too deep into the details of our model: We propose that the neocortex is responsible for constructing and representing an egocentric map of space, i.e. a firing pattern that codes for the egocentric position of environmental features. The hippocampus is an autoassociative memory that performs pattern completion on egocentric maps, as well as on more general firing patterns. This function may explain the involvement of the hippocampus in certain spatial tasks, as well as in general episodic memory. In the example of the Morris water maze, a rat introduced to the maze constructs a partial map from sensory input, that will contain the positions of observable cues but not the hidden platform. This will trigger the recall of a full map stored in the hippocampus during previous exploration, that also contains a representation of the platform location. After recall, the neocortical firing pattern will contain a representation of the platform location, even though it was not directly observed. Neocortical motor systems then allow the rat to head directly towards the platform. 
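A cartoon of this proposal, purely for illustration (not the authors' implementation; the slot encoding, the number of landmarks, and the best-match recall rule are assumptions of mine): the egocentric map is reduced to a vector with one slot per landmark plus one for the hidden platform, complete maps from earlier exploration are stored, and the partial map built from current sensory input cues recall of the best-matching stored map, whose platform slot is then available to drive behavior.

import numpy as np

# Cartoon of the proposal above (not the authors' code): an "egocentric map"
# is a vector of slots, one per landmark plus one for the hidden platform.
# Complete maps stored during exploration are recalled from a partial map
# built from current sensory input, in which the platform slot is unknown.
rng = np.random.default_rng(3)
n_landmarks = 8
slots = n_landmarks + 1                      # last slot: platform location

# maps stored during exploration of two different environments
stored_maps = rng.normal(0, 1, (2, slots))

def complete(partial, known):
    # recall by best match on the observed (landmark) slots only
    scores = stored_maps[:, known] @ partial[known]
    return stored_maps[int(np.argmax(scores))]

# at test: sensory input gives a noisy version of environment 0's landmark
# slots, but the platform slot is unobserved
partial = stored_maps[0].copy() + rng.normal(0, 0.1, slots)
known = np.arange(n_landmarks)
recalled = complete(partial, known)
print(stored_maps[0, -1])                    # true platform slot
print(recalled[-1])                          # platform slot supplied by recall

An attractor network would do the completion by settling rather than by an explicit best-match lookup, but the input/output relationship is the same kind of cue-to-stored-pattern mapping.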
----------------------------------------------- Ken Harris Department of Anatomy and Developmental Biology University College London http://www.anat.ucl.ac.uk/~kdh From adr at nsma.arizona.edu Thu Aug 27 15:01:03 1998 From: adr at nsma.arizona.edu (David Redish) Date: Thu, 27 Aug 1998 12:01:03 -0700 Subject: function of hippocampus In-Reply-To: Your message of "Tue, 25 Aug 1998 22:00:40 MST." <13795.38520.152296.309141@coltrane.ucsf.edu> Message-ID: <199808271901.MAA23599@cortex.NSMA.Arizona.EDU> Ken Miller wrote: >With respect to recent postings about models of hippocampus and >memory, I'd like to toss in a cautionary note. A recent report >(Elisabeth A. Murray and Mortimer Mishkin, "Object Recognition and >Location Memory in Monkeys with Excitotoxic Lesions of the Amygdala >and Hippocampus", J. Neuroscience, August 15, 1998, 18(16):6568-6582) >finds no deficit in tasks involving visual recognition memory or >spatial memory with lesions of hippocampus and amygdala. Instead, >deficits in both cases are associated with, and only with, lesion of >the overlying rhinal cortex. They mention in the discussion evidence >that "has suggested that the hippocampus may be more important for >path integration on the basis of self-motion cues than for location >memory, per se" (though Redish' recent posting mentions evidence >against this from recent experiments of Alyan and McNaughton; I >couldn't find a reference in medline). This is the latest in a series >of reports along these lines from the Mishkin lab, who did much of the >original lesion work that seemed to implicate hippocampus in memory. > >I'm not in any way an expert on this literature -- only a very distant >observer -- but I worry that, based on lesion studies that also >involved lesions of overlying cortex, both the neuroscience and >connectionists communities may have jumped to a wrong conclusion that >the hippocampus has a special role in episodic and/or spatial memory. >I'd be interested to know if there's still good reason to believe in >such a role ... One should be very careful about taking anything in the primate literature as bearing on spatial navigation. All of the primate hippocampal recordings and all primate hippocampal lesions studies have used primates looking at a constellation of objects or being moved about the room in chairs. Rodent studies have shown that hippocampal lesions affect environmental dependent tasks much more so than object dependent tasks (see, for example, Cassaday and Rawlins, 1997). Also, rats restrained by towels or laying in hammocks and passively moved around the room do not show normal place fields (Foster et al. 1989, Gavrilov et al. 1996). The specific task used by Murray and Mishkin (1998) was to find food from one of two wells placed in front of the animal. This task is not "spatial navigation"; it is spatial reasoning. Major distinctions can be drawn between this task and the kind of hippocampal-dependent tasks used in rodent navigation. (1) The task can be solved by an egocentric spatial reasoning system, while the rodent hippocampus seems to be critically involved in allocentric spatial reasoning. (2) The task is dependent on small objects in front of the animal, while the rodent navigation tasks dependent on the hippocampus require manipulations of environmental context. But there is another very nice result from Murray and Mishkin (1996) that does bear on this issue: Alvarez et al. (1994) tested primates with hippocampal lesions in a delayed-non-match-to-sample task (DNMS). Alvarez et al. 
found that their hippocampally lesioned animals were impaired at long delays (10 minutes and 40 minutes), but not short delays (8 sec, 40 sec, 1 minute). They interpreted this difference as a consequence of the length of the delay. However, they did not use the same experimental paradigm for the short and long delay trials: for the longer trials, they removed the monkey from the apparatus, put it back in its home cage during the delay, and returned it to the apparatus after the delay. If the hippocampus is critical for the reinstantiation of context on returning to an environment, we might expect this removal from the environment to strongly affect the hippocampally lesioned animals (Nadel, 1995, Redish, 1997). Murray and Mishkin (1996) tested exactly this: they used a continuous-non-match-to-sample task (CNMS) which is similar to the DNMS task except that animals are shown a sequence of example objects and then shown novel pairs in reverse order. This means that although there is a delay between the time the animal sees the first object and when it sees the corresponding last pair, the animal never leaves the experimental situation. Murray and Mishkin showed that if the animals do not leave the context, then they can perform the task well even without a hippocampus. This environmental context-change is, I think, a better analogy to the rodent navigation literature. adr PS. The Alyan et al. 1997 reference is to a neuroscience abstract. I don't know the current status of the paper they are writing based on that work. REFERENCES P. Alvarez and L. R. Squire (1994) Memory consolidation and the medial temporal lobe: A simple network model, Proceedings of the National Academy of Sciences, USA, 91:7041-7045. H. J. Cassaday and J. N. P. Rawlins (1997) The hippocampus, objects, and their contexts, Behavioral Neuroscience, 111(6):1228-1244. T. C. Foster, C. A. Castro and B. L. McNaughton (1989) Spatial selectivity of rat hippocampal neurons: Dependence on preparedness for movement, Science, 244:1580-1582. V. V. Gavrilov, S. I. Wiener and A. Berthoz (1996) Discharge correlates of hippocampal neurons in rats passively displaced on a mobile robot, Society for Neuroscience Abstracts, 22:910. E. A. Murray and M. Mishkin (1996) 40-minute visual recognition memory in rhesus monkeys with hippocampal lesions, Society for Neuroscience Abstracts, 22:281. L. Nadel, The role of the hippocampus in declarative memory: A commentary on Zola-Morgan, Squire, and Ramus, 1994, Hippocampus, 5:232-234. A. D. Redish (1997) Beyond the Cognitive Map: Contributions to a Computational Neuroscience Theory of Rodent Navigation. PhD Thesis. Carnegie Mellon University. S. Zola-Morgan and L. R. Squire and S. J. Ramus (1994) Severity of memory impairment in monkeys as a function of locus and extent of damage within the medial temporal lobe memory system, 4:483-495. From dblank at comp.uark.edu Thu Aug 27 14:45:05 1998 From: dblank at comp.uark.edu (Douglas Blank) Date: Thu, 27 Aug 1998 13:45:05 -0500 Subject: Connectionist symbol processing: any progress? References: <35D1E00A.7B1CC6A2@icsi.berkeley.edu> Message-ID: <35E5A931.F6E72BE0@comp.uark.edu> Dave_Touretzky at cs.cmu.edu wrote: > I'd like to start a debate on the current state of connectionist > symbol processing? Is it dead? Or does progress continue? For me, it is dead.  Implementing symbol processing in networks was a good first step in solving many problems that plagued symbolic systems. Tony Plate's HRR as applied to analogy is a great example (Plate, 1993). 
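For readers who have not seen HRRs, a minimal sketch of the binding operation may help (illustrative only, not Plate's code; the dimensionality and the role/filler names are arbitrary assumptions): items are random high-dimensional vectors, binding is circular convolution, composition is superposition, and unbinding is circular correlation, which returns a noisy vector that a clean-up memory would map back to the nearest stored item.

import numpy as np

# Minimal Holographic Reduced Representation sketch (illustrative only).
rng = np.random.default_rng(0)
n = 1024                      # dimensionality; elements ~ N(0, 1/n)

def vec():
    return rng.normal(0.0, 1.0 / np.sqrt(n), n)

def bind(a, b):
    # circular convolution via FFT
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n)

def unbind(trace, a):
    # circular correlation: convolve with the involution of a (approximate inverse)
    a_inv = np.concatenate(([a[0]], a[:0:-1]))
    return bind(trace, a_inv)

def cosine(x, y):
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

# encode "agent=john, object=ball" as a single vector (superposition of bindings)
agent, obj, john, ball = vec(), vec(), vec(), vec()
trace = bind(agent, john) + bind(obj, ball)

# decoding the agent role yields a noisy copy of 'john'; a clean-up memory
# would map it back to the nearest stored item
noisy = unbind(trace, agent)
print(cosine(noisy, john))    # substantially positive
print(cosine(noisy, ball))    # near zero

Dot products between such traces then give a cheap estimate of structural similarity, which is what makes inexpensive similarity estimation possible in the first place.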
Using connectionist representations and methodologies, an expensive symbolic similarity estimation process was eliminated in the analogy-making MAC/FAC system (Gentner and Forbus, 1991). The bad news is that, in my opinion, the entire MAC/FAC model (like many symbolic models) has a fatal flaw and will never lead to an autonomous, flexible, creative, intelligent (analogy-making) machine. Even if Gentner's entire model were implemented completely in a network (or even real neurons), their problem would remain: the overall system organization is still "symbolic". Their method requires that analogies be encoded as symbols and structures, which leaves no room for perception or context effects during the analogy making process (for a detailed description of this problem, see Hofstadter, 1995). I believe that in order to solve the big AI/cognitive problems ahead (like making analogies), we, as modelers, will have to face a radical idea: we will no longer understand how our models solve a problem exactly. I mean that, for many complex problems, systems that solve them won't be able to be broken down into symbols and modules, and, therefore, there may not be a description of the solution more abstract than the actual solution itself. Some researchers have been focusing on solving high-level problems via a purely connectionist framework rather than augmenting a symbolic one. Meeden's planning system comes to mind, as does (warning: self-promotion) my own work in analogy-making (Meeden, 1994; Blank, 1997). Rather than focusing on some assumed-necessary symbolically-based process (say, variable binding) these models look at a bigger goal: modeling a complex behavior. Building and manipulating structured representations or binding variables via networks should not be our goals.* Neither should creating a model such that we can understand its inner workings.** Rather, we should focus on the techniques that allow a system to self-organize such that it can solve The Bigger Problems. I think much of the discussion on "learning to learn" has been related to this issue. > I'd love to hear some good news. For me, "connectionist symbol processing" was a very useful stage I went through as a cognitive scientist. Now I see that networks can do the equivalent of processing symbols, and not have anything to do with symbols. In addition, I learned that I can feel ok about not understanding exactly how they do it. -Doug Blank *Of course, building and manipulating structured representations or binding variable via nets is still useful for some problems, just not all of them. **The DOD is not interested in these types of systems. References Blank, D.S. (1997). "Learning to see analogies: a connectionist exploration." Unpublished PhD Thesis, Indiana University, Bloomington. http://dangermouse.uark.edu/~dblank/thesis.html Gentner, D., and Forbus, K. (1991). MAC/FAC: a model of similarity-based access and mapping. In "Proceedings of the Thirteenth Annual Cognitive Science Conference," 504-9. Hillsdale, NJ: Lawrence Erlbaum. Hofstadter, D., and FARG (1995). "Fluid concepts and creative analogies." Basic Books, new York, NY. Meeden, L. (1994) "Towards planning: incremental investigations into adaptive robot control." Unpublished PhD Thesis, Indiana University, Bloomington. http://www.cs.swarthmore.edu/~meeden/ Plate, T.A. (1991). Holographic reduced representations: convolution algebra for compositional distributed representations. In "Proceedings of the Twelfth International Joint Conference on Artificial Intelligence." 
Myopoulos, J. and Reiter, R. (Eds.), pp. 30-35. Morgan Kaufmann. -- ===================================================================== dblank at comp.uark.edu Douglas Blank, University of Arkansas Assistant Professor Computer Science ==================== http://www.uark.edu/~dblank ==================== From zorzi at univ.trieste.it Thu Aug 27 15:21:05 1998 From: zorzi at univ.trieste.it (Marco Zorzi) Date: Thu, 27 Aug 1998 20:21:05 +0100 Subject: What have neural networks achieved? In-Reply-To: <199808251826.OAA27165@CNBC.CMU.EDU> Message-ID: <3.0.5.32.19980827202105.007c2cd0@uts.univ.trieste.it> Jay McClelland writes: >There has been a great deal of connectionist work on the processing of >regular and exceptional material, initiated by the >Rumelhart-McClelland paper on the past tense. Debate has raged on the >subject of the past tense and work there is ongoing, but I won't claim >a success story there at this time. What I would like to point to >instead is the related topic of single word reading. Sejnowski and >Rosenberg's NETTALK first extended connectionist ideas to this issue, >and Seidenberg and McClelland went on to show that a connectionist >model could account in great detail for the pattern of reaction times >found in around 30 studies concerning the effects of regularity, >frequency, and lexical neighbors on reading words aloud. This was >followed by a resounding critique along the lines of Pinker and >Prince's critique of R&M, coming this time from Derrick Besner (and >colleagues) and Max Coltheart (and colleagues). Both pointed to the >fact that the S&M model didn't do a very good job of reading nonwords, >and both claimed that this reflected an in-principal limitation of a >connectionist, single mechanism account: To do a good job with both, >it was claimed, a dual route system was required. > >The success story is a paper by Plaut, McClelland, Seidenberg, and >Patterson, in which it was shown in fact that a single mechanism, >connectionist model can indeed account for human performance in >reading both words and nonwords. The model replicated all the S&M >findings, and at the same time was able to read non-words as well as >human subjects, showing the same types of neighbor-driven responses that >human readers show (eg MAVE is sometimes read to rhyme with HAVE >instead of SAVE). > >Of course there are still some loose ends but it is no longer possible >to claim that a single-mechanism account cannot capture the basic >pattern of word and non-word reading data. The demonstration that a single mechanism (ie a single, uniform network) can deal with both regular and exception items does not speak to the issue of which system humans are more likely to posses. For example, it is easy to constrain a single backpropagation network to perform both "what" and "where" vision tasks (Rueckl, Cave & Kosslyn, 1989), but the most efficient way to do it is through a modular architecture (Jacobs, Jordan, & Barto, 1991); incidentally, this is also what the brain seems be doing (in a very broad sense). This is the general (and important) issue of modular decomposition in learning (see Ghahramani & Wolpert, 1997, for recent evidence that the brain uses a modular decomposition strategy to learn a new visuomotor task). With regard to the more specific issue of regular vs. exception and/or words vs. 
non-words, a modular connectionist perspective (alternative to the approach of Plaut, McClelland, Seidenberg, & Patterson, 1996) can be found in papers (just appeared) by Zorzi, Houghton, and Butterworth (1998a, 1998b) for reading, and by Houghton and Zorzi (1998) for spelling (refs and abstracts below). The main point here is that the regularities of a "quasi-regular" domain such as reading or spelling are more easily and quickly extracted by a network without hidden units and trained with the simple delta rule; this also provides early and robust generalization to novel forms (e.g., non-words). The reading model has been shown to account for a wide range of empirical findings, including experimental, neuropsychological and developmental data. -- Marco Zorzi

References: Ghahramani, Z., & Wolpert, D.M. (1997). Modular decomposition in visuomotor learning. Nature, 386, 392-395. Houghton, G., & Zorzi, M. (1998). A model of the sound-spelling mapping in English and its role in word and nonword spelling. In Proceedings of the Twentieth Annual Conference of the Cognitive Science Society (pp. 490-501). Mahwah (NJ): Erlbaum. Jacobs, R.A., Jordan, M.I., & Barto, A.G. (1991). Task decomposition through competition in a modular connectionist architecture: The What and Where vision tasks. Cognitive Science, 15, 219-250. Plaut, D. C., McClelland, J. L., Seidenberg, M. S. & Patterson, K. E. (1996). Understanding normal and impaired word reading: Computational principles in quasi-regular domains. Psychological Review, 103, 56-115. Rueckl, J.G., Cave, K.R., & Kosslyn, S.M. (1989). Why are "What" and "Where" processed by separate cortical visual systems? A computational investigation. Journal of Cognitive Neuroscience, 1, 171-186. Zorzi, M., Houghton, G., & Butterworth, B. (1998a). Two routes or one in reading aloud? A connectionist dual-process model. Journal of Experimental Psychology: Human Perception and Performance, 24, 1131-1161. Zorzi, M., Houghton, G., & Butterworth, B. (1998b). The development of spelling-sound relationships in a model of phonological reading. Language and Cognitive Processes (Special Issue: Language Acquisition and Connectionism), 13, 337-371.

Two Routes or One in Reading Aloud? A Connectionist Dual-Process Model Marco Zorzi, George Houghton and Brian Butterworth Journal of Experimental Psychology: Human Perception and Performance, 1998, Vol. 24, No. 4, 1131-1161 A connectionist study of word reading is described that emphasizes the computational demands of the spelling-sound mapping in determining the properties of the reading system. It is shown that the phonological assembly process can be implemented by a two-layer network, which easily extracts the regularities in the spelling-sound mapping for English from training data containing many exception words. It is argued that productive knowledge about spelling-sound relationships is more easily acquired and used if it is separated from case-specific knowledge of the pronunciation of known words. It is then shown how the interaction of assembled and retrieved phonologies can account for the combined effects of frequency and regularity-consistency and for the reading performance of dyslexic patients. It is concluded that the organization of the reading system reflects the demands of the task and that the pronunciations of nonwords and exception words are computed by different processes.

The development of spelling-sound relationships in a model of phonological reading.
Marco Zorzi, George Houghton and Brian Butterworth Language and Cognitive Processes (Special Issue: Language Acquisition and Connectionsim), 1998, Vol. 13 (2/3), 337-371. Developmental aspects of the spelling to sound mapping for English monosyllabic words are investigated with a simple 2-layer network model using a simple, general learning rule. The model is trained on both regularly and irregularly spelled words, but extracts the regular spelling to sound relationships which it can apply to new words, and which cause it to regularize irregular words. These relationships are shown to include single letter to phoneme mappings as well as mappings involving larger units such as multi-letter graphemes and onset-rime structures. The development of these mappings as a function of training is analyzed and compared with relevant developmental data. We also show that the 2-layer model can generalize after very little training, in comparison to a 3-layer network. This ability relies on the fact that orthography and phonology can make direct contact with each other, and its importance for self-teaching is emphasized. A model of the sound-spelling mapping in English and its role in word and nonword spelling. George Houghton and Marco Zorzi In: Proceedings of the Twentieth Annual Conference of the Cognitive Science Society (p. 490-501), 1998. A model of the productive sound-spelling mapping in English is described, based on previous work on the analogous problem for reading (Zorzi, Houghton & Butterworth, 1998a, 1998b). It is found that a two-layer network can robustly extract this mapping from a representative corpus of English monosyllabic sound-spelling pairs, but that good performance requires the use of graphemic representations. Performance of the model is discussed for both words and nonwords, direct comparison being made with the spelling of surface dysgraphic MP (Behrmann & Bub, 1992). The model shows appropriate contextual effects on spelling and exactly reproduces many of the subject’s spellings. Effects of sound-spelling consistency are examined, and results arising from the interaction of this system with a lexical spelling system are compared with normal subject data. ---------------------------------------------------------------------- Marco Zorzi email: marco at psychol.ucl.ac.uk http://www.psychol.ucl.ac.uk/marco.zorzi/marco.html Department of Psychology voice: +44 171 5045393 University College London fax : +44 171 4364276 Gower Street London WC1E 6BT (UK) (and) Dipartimento di Psicologia voice: +39 40 6767325 Universita` di Trieste fax : +39 40 312272 via dell'Universita` 7 email: zorzi at univ.trieste.it 34123 Trieste (Italy) ---------------------------------------------------------------------- From jagota at cse.ucsc.edu Thu Aug 27 21:42:49 1998 From: jagota at cse.ucsc.edu (Arun Jagota) Date: Thu, 27 Aug 1998 18:42:49 -0700 Subject: a second, similar proposal Message-ID: <199808280142.SAA09611@arapaho.cse.ucsc.edu> As with my earlier call for archivable contributions on the "connectionist symbol processing" debate (the call generated a sufficient response to proceed to the implementation phase) it seems to me there are valuable postings also in the "big success stories" thread which would be nice to collect together into an article. 
In view of this, I make a similar proposal: If you have a "big success story" and are interested in making a half- to one-page archival contribution towards a "distributed" article that collects "big success stories", e-mail me your story in plain text. Sending me what you already posted on Connectionists (if you did) is fine. A mild preference to place all references in bibtex at the end and use latex commands where appropriate. A "big success story" may be in engineering applications or in understanding the brain. I intend to keep the two separate. Contributions would be rapidly reviewed for minimal content. Soft deadline: 9/7/98. The implementation phase will depend on quality and quantity of contributions. Should this phase be entered, the resulting article(s) will be archived in Neural Computing Surveys, http://www.icsi.berkeley.edu/~jagota/NCS Arun Jagota jagota at cse.ucsc.edu

From ken at phy.ucsf.EDU Fri Aug 28 02:02:19 1998 From: ken at phy.ucsf.EDU (Ken Miller) Date: Thu, 27 Aug 1998 23:02:19 -0700 (PDT) Subject: What have neural networks achieved? Message-ID: <13798.18411.850138.465168@coltrane.ucsf.edu> >>>>> "-" == Terry Sejnowski writes: -> Regarding the comment by Ken Miller, the regions of the cortex that -> surround the hippocampus, including the entorhinal cortex, the -> perirhinal cortex and the parahippocampal cortex are staging areas -> for converging inputs to the hippocampus. Stuart Zola has shown -> that the severity of amnesia following lesions of these areas in monkeys is -> greater as more surrounding cortical areas are included in the lesion. -> The famous case of HM had surgical removal of the temporal lobe which -> included the areas surrounding the hippocampus. The view in the field is no -> longer to think of the hippocampus as the primary site but as part -> of a memory system in reciprocal interaction with these cortical -> areas. It's clear from the lesion studies that the surrounding pieces of cortex are involved in memory. The problem is, what's the evidence that hippocampus itself is involved in memory? -- given that when Mishkin lesions hippocampus, there is no deficit in visual recognition or spatial memory. Is the main evidence just that it's in heavy reciprocal interaction with the places that *are* implicated by lesion studies in memory? Randy O'Reilly suggested Mishkin's results could be explained by postulating the overlying cortex can handle 'familiarity', and that is enough for the memory tasks studied. Whether it is enough for those tasks is a separate question, but even if so the logic, if I understand it right, seems a little tortured to me (though not impossible): lesioning overlying cortex affects measurements of memory because it destroys all those inputs to hippocampus; yet lesioning hippocampus itself doesn't affect those same measurements of memory because the overlying cortex can do some weak memory-like things by itself. It seems a pretty convoluted set of reasoning to preserve the idea that hippocampus is critical to memory, compared to the simpler conclusion that the overlying cortex is critical to memory and hippocampus just isn't involved. So again, is there positive evidence for *hippocampal* involvement in memory? There may well be some, but I don't think I've heard it yet ... don't mean to be argumentative, just really wondering ... Ken Kenneth D. Miller telephone: (415) 476-8217 Dept.
of Physiology fax: (415) 476-4929 UCSF internet: ken at phy.ucsf.edu 513 Parnassus www: http://www.keck.ucsf.edu/~ken San Francisco, CA 94143-0444

From jlm at cnbc.cmu.edu Fri Aug 28 08:01:32 1998 From: jlm at cnbc.cmu.edu (Jay McClelland) Date: Fri, 28 Aug 1998 08:01:32 -0400 (EDT) Subject: What have neural networks achieved? In-Reply-To: <13798.18411.850138.465168@coltrane.ucsf.edu> (message from Ken Miller on Thu, 27 Aug 1998 23:02:19 -0700 (PDT)) Message-ID: <199808281201.IAA22968@CNBC.CMU.EDU> I reply to Ken's question: There are huge difficulties associated with the determination of whether or not hippocampal lesions produce deficits in memory, especially when categorical labels are used such as 'recognition memory' or 'spatial memory' or 'episodic memory' or whatever. These difficulties include the fact that there are few lesions that are totally foolproof in terms of either their selectivity or their completeness. The difficulties also include the fact that the particular parameters of tasks often make a tremendous difference and labels such as recognition memory, spatial memory, etc. really aren't fully adequate to characterize which tasks do and which tasks do not show deficits. It seems to be pretty well established in the rodent literature, however, that ibotenate lesions of the hippocampus (which leave fibers of passage intact) produce profound deficits in many spatial tasks (e.g., the animal must learn to find a submerged platform in a tank of milky water using cues around the room, starting from a location in the tank that varies from trial to trial), although again the effects depend on the details of the experiments (the animal can learn if there's a lightbulb over the platform or if training and testing are always done with the same fixed starting place so a fixed path can be used). Much less clear is whether such lesions also produce deficits in non-spatial tasks. In my view the literature is consistent with the idea that the hippocampus is crucial for *rapid* learning that depends on conjunctions of cues; indeed I am one of many who think that the role of the hippocampus in spatial tasks is secondary to its role in using a form of coding we call sparse random conjunctive coding, but this matter is far from settled. There is an overview of the empirical data through 1994 in the McClelland, McNaughton and O'Reilly paper (Psychological Review, 1995, 102, 419-457). I append a couple of other relevant citations. -- Jay McClelland @Article{Jarrard93, author = "Jarrard, L. E.", title = "On the role of the hippocampus in learning and memory in the rat", journal = "Behavioral and Neural Biology", year = 1993, volume = 60, pages = "9-26" } @Article{RudySutherland9X, author = "Rudy, J. W. and Sutherland, R. W.", title = "Configural Association Theory and the Hippocampal Formation: {An} Appraisal and Reconfiguration", journal = "Hippocampus", year = "199?" %I think it's 95 or 96 -- jlm }

From lemm at lorentz.uni-muenster.de Fri Aug 28 08:24:52 1998 From: lemm at lorentz.uni-muenster.de (Joerg_Lemm) Date: Fri, 28 Aug 1998 14:24:52 +0200 (MEST) Subject: Paper: A priori information, statistical mechanics Message-ID: The following technical report is available at: http://pauli.uni-muenster.de/~lemm or directly: http://pauli.uni-muenster.de/~lemm/papers/ann98.ps.gz or at the Los Alamos preprint server as cond-mat/9808039 (gzipped ps-file): http://xxx.lanl.gov/ps/cond-mat/9808039 Joerg C.
Lemm: "How to Implement A Priori Information: A Statistical Mechanics Approach" Generalization abilities of empirical learning systems are essentially based on a priori information. The paper emphasizes the need of empirical measurement of a priori information by a posteriori control. A priori information is treated analogously to an infinite number of training data and expressed explicitly in terms of the function values of interest. This contrasts an implicit implementation of a priori information, e.g., by choosing a restrictive function parameterization. Different possibilities to implement a priori information are presented. Technically, the proposed methods are non--convex (non--Gaussian) extensions of classical quadratic and thus convex regularization approaches (or Gaussian processes, respectively). Specific topics discussed include approximate symmetries, approximate structural assumptions, transfer of knowledge and combination of learning systems. Appendix A compares concepts of statistics and statistical mechanics. Appendix B relates the paper to the framework of Bayesian decision theory. University of Muenster Publication No.: MS-TP1-98-12 Available at: http://pauli.uni-muenster.de/~lemm or directly: http://pauli.uni-muenster.de/~lemm/papers/ann98.ps.gz or at the Los Alamos preprint server as cond-mat/9808039 (gzipped ps-file): http://xxx.lanl.gov/ps/cond-mat/9808039 ======================================================================== Dr. Joerg Lemm Universitaet Muenster Email: lemm at uni-muenster.de Institut fuer Theoretische Physik I Phone: +49(251)83-34922 Wilhelm-Klemm-Str.9 Fax: +49(251)83-36328 D-48149 Muenster, Germany http://pauli.uni-muenster.de/~lemm ======================================================================== From harnad at coglit.soton.ac.uk Fri Aug 28 12:28:47 1998 From: harnad at coglit.soton.ac.uk (Stevan Harnad) Date: Fri, 28 Aug 1998 17:28:47 +0100 (BST) Subject: Barsalou on Perceptual Symbol Systems: BBS Call for Commentary Message-ID: Below is the abstract of a forthcoming BBS target article on: PERCEPTUAL SYMBOL SYSTEMS by Lawrence W. Barsalou This article has been accepted for publication in Behavioral and Brain Sciences (BBS), an international, interdisciplinary journal providing Open Peer Commentary on important and controversial current research in the biobehavioral and cognitive sciences. Commentators must be BBS Associates or nominated by a BBS Associate. To be considered as a commentator for this article, to suggest other appropriate commentators, or for information about how to become a BBS Associate, please send EMAIL to: bbs at cogsci.soton.ac.uk or write to: Behavioral and Brain Sciences Department of Psychology University of Southampton Highfield, Southampton SO17 1BJ UNITED KINGDOM http://www.princeton.edu/~harnad/bbs/ http://www.cogsci.soton.ac.uk/bbs/ ftp://ftp.princeton.edu/pub/harnad/BBS/ ftp://ftp.cogsci.soton.ac.uk/pub/bbs/ gopher://gopher.princeton.edu:70/11/.libraries/.pujournals If you are not a BBS Associate, please send your CV and the name of a BBS Associate (there are currently over 10,000 worldwide) who is familiar with your work. All past BBS authors, referees and commentators are eligible to become BBS Associates. To help us put together a balanced list of commentators, please give some indication of the aspects of the topic on which you would bring your areas of expertise to bear if you were selected as a commentator. 
An electronic draft of the full text is available for inspection with a WWW browser, anonymous ftp or gopher according to the instructions that follow after the abstract. ____________________________________________________________________ PERCEPTUAL SYMBOL SYSTEMS Lawrence W. Barsalou Department of Psychology Emory University Atlanta, GA 30322 http://userwww.service.emory.edu/~barsalou/ barsalou at emory.edu KEYWORDS: analogue processing, categories, concepts, frames, imagery, images, knowledge, perception, representation, sensory-motor representations, simulation, symbol grounding, symbol systems ABSTRACT: Prior to the twentieth century, theories of knowledge were inherently perceptual. Since then, developments in logic, statistics, and programming languages have inspired amodal theories that rest on principles fundamentally different from those underlying perception. In addition, perceptual approaches have become widely viewed as untenable, because they are assumed to implement recording systems, not conceptual systems. A perceptual theory of knowledge is developed here in the contexts of current cognitive science and neuroscience. During perceptual experience, association areas in the brain capture bottom-up patterns of activation in sensory-motor areas. Later, in a top-down manner, association areas partially reactivate sensory-motor areas to implement perceptual symbols. The storage and reactivation of perceptual symbols operates at the level of perceptual components--not at the level of holistic perceptual experiences. Through the use of selective attention, schematic representations of perceptual components are extracted from experience and stored in memory (e.g., individual memories of green, purr, hot). As memories of the same component become organized around a common frame, they implement a simulator that produces limitless simulations of the component (e.g., simulations of purr). Not only do such simulators develop for aspects of sensory experience, they also develop for aspects of proprioception (e.g., lift, run) and for introspection (e.g., compare, memory, happy, hungry). Once established, these simulators implement a basic conceptual system that represents types, supports categorization, and produces categorical inferences. These simulators further support productivity, propositions, and abstract concepts, thereby implementing a fully functional conceptual system. Productivity results from integrating simulators combinatorially and recursively to produce complex simulations. Propositions result from binding simulators to perceived individuals to represent type-token relations. Abstract concepts are grounded in complex simulations of combined physical and introspective events. Thus, a perceptual theory of knowledge can implement a fully functional conceptual system while avoiding what it is becoming increasingly apparent would be problems for amodal symbol systems. Implications for cognition, neuroscience, evolution, development, and artificial intelligence are explored. -------------------------------------------------------------- To help you decide whether you would be an appropriate commentator for this article, an electronic draft is retrievable from the World Wide Web or by anonymous ftp or gopher from the US or UK BBS Archive. Ftp instructions follow below. Please do not prepare a commentary on this draft. Just let us know, after having inspected it, what relevant expertise you feel you would bring to bear on what aspect of the article. 
The URLs you can use to get to the BBS Archive: http://www.princeton.edu/~harnad/bbs/ http://www.cogsci.soton.ac.uk/bbs/Archive/bbs.barsalou.html ftp://ftp.princeton.edu/pub/harnad/BBS/bbs.barsalou ftp://ftp.cogsci.soton.ac.uk/pub/bbs/Archive/bbs.barsalou gopher://gopher.princeton.edu:70/11/.libraries/.pujournals To retrieve a file by ftp from an Internet site, type either: ftp ftp.princeton.edu or ftp 128.112.128.1 When you are asked for your login, type: anonymous Enter password as queried (your password is your actual userid: yourlogin at yourhost.whatever.whatever - be sure to include the "@") cd /pub/harnad/BBS To show the available files, type: ls Next, retrieve the file you want with (for example): get bbs.barsalou When you have the file(s) you want, type: quit From zhuh at santafe.edu Fri Aug 28 17:37:15 1998 From: zhuh at santafe.edu (Huaiyu Zhu) Date: Fri, 28 Aug 1998 15:37:15 -0600 (MDT) Subject: What have neural networks achieved? In-Reply-To: Message-ID: As to the objective of a better understanding of the brain, I'd like to draw your attention to a result about The Computational Origin of Addiction. A learning algorithm derived from purely computational considerations was shown to require a particular mechanism reminiscent of that provided by the neurotransmitter dopamine, including the possibility of addiction. I'd be especially interested in hearing responses from people familiar with neurophysiology and the role of dopamine. It might even be possible to test this theory with current experimental technology. ftp://ftp.santafe.edu/pub/zhuh/link-iconip97.ps A possible link between artificial and biological neural network learning rules Huaiyu Zhu A learning rule for stochastic neural networks is described, which corresponds to biological neural systems in all major aspects. Instead of backpropagating a vector through the synapses, only a few scalars are broadcast across the whole network, corresponding to the role played by the neurotransmitter dopamine. In addition, the annealing process avoids local optima in the learning process and corresponds to the difference in learning between adults and children. Some more detailed predictions are made for future comparison with neurophysiological data. (In Proc. Intl. Conf. Neural Information Processing (ICONIP'97), Vol.1, pp.263-266. Dunedin, New Zealand, 28-30 Nov, 1997) Huaiyu -- Huaiyu Zhu Tel: 1 505 984 8800 ext 305 Santa Fe Institute Fax: 1 505 982 0565 1399 Hyde Park Road mailto:zhuh at santafe.edu Santa Fe, NM 87501 http://www.santafe.edu/~zhuh/ USA ftp://ftp.santafe.edu/pub/zhuh/ From larryy at pobox.com Fri Aug 28 18:41:11 1998 From: larryy at pobox.com (Larry Yaeger) Date: Fri, 28 Aug 1998 17:41:11 -0500 Subject: What have neural networks achieved? In-Reply-To: Message-ID: At 2:07 PM -0800 8/14/98, Michael A. Arbib wrote: >b) What are the "big success stories" (i.e., of the kind the general public >could understand) for neural networks contributing to the construction of >"artificial" brains, i.e., successfully fielded applications of NN hardware >and software that have had a major commercial or other impact? I sent a private comment to Michael Arbib, but since I never announced the availability of the comprehensive technical paper on connectionists, I'll briefly pipe up now. Though I wouldn't call it an "artificial brain", I would call it a successfully fielded application of NN software that had some degree of commercial and technological impact...
The "Print Recognizer" in second and subsequent generation Newton PDAs was neural network based, and was fairly widely regarded as the first successful, truly usable handwriting recognition solution. (It had nothing whatsoever to do with the original handwriting recognition system in first generation Newtons.) When it was introduced, this handwriting recognizer essentially "saved" the Newton, breathing new life into the product and bringing a level of public acceptance of the device's primary input method (even though the product was killed a few years later). A fairly detailed technical paper on the subject is available in: Yaeger, L. S., Webb, B. J., Lyon, R. F., Combining Neural Networks and Context-Driven Search for On-Line, Printed Handwriting Recognition in the Newton, AI Magazine, AAAI, 19:1 (Spring 1998) p73-89. Or in preprint form at: Other information on this system and other, more basic research on evolving neural network architectures in a computational ecology can easily be found through my personal web site (URL in .sig below). - larryy ------------------------- "'I heard the Empire has a tyrannical and repressive government!' 'What form of government is that?' 'A tautology.'" ------------------------- "One of these days... Milkshake!... BANG!!" From dcrespin at euler.ciens.ucv.ve Sat Aug 29 07:25:25 1998 From: dcrespin at euler.ciens.ucv.ve (Daniel Crespin(UCV) Date: Sat, 29 Aug 1998 15:25:25 +0400 Subject: What have neural networks achieved? Message-ID: <199808291125.PAA20828@gauss.ucv.ve> About "What have neural networks achieved?", here is a condensed personal viewpoint, particularly about forward pass perceptron neural networks. As you will see below, I expect this e-mail to motivate not only thoughts but also certain concrete action. In order attain perspective, ask the following similar question: "What have computers achieved?" and compare with answers to the previous question. First came the birth of perceptrons. An elegant model for nervous systems, it caught lots of attention. Just after Hitler, the Holocaust and Hiroshima, the possibility of in-depth understanding of the human brain and behaviour could not pass unnoticed. But a persuasive book *against* perceptrons was written, and for some time they were left outside mainstream science. Then, backpropagation was created. A learning algorithm, a paradigm, a source of projects. The field of neural networks was (re)born. In the last analysis, backpropagation is just a special mixture of the gradient method (GM) and the chain rule, inheriting all the difficulties and shortcomings of the former. The classical picture of GM is: High dimensional landscapes with hills, saddles and valleys, starting at a random point and moving downhill towards a local minimum that one one does not know if it is a global one. Or perhaps wandering away towards infinity. Or unadvertedly jumping over the sought-after well. And then, to apply backpropagation, the network architecture has to be defined in advance, a task for which no efficient method has been available. Hence the random starting weights, and the random topology, or the "educated guess", or just the "guess". This means that lots of gaps are left to be filled, which may be good or bad, depending on projects and levels. Number crunching power has been a popular remedy, but the task is rather complex and results are still not satisfactory. This is, with considerable simplification, a possible sketch of the neuroscape to this date. 
The rather limited (as compared with computers in general) lists of successes previously forwarded as answers to the Subject of this e-mail debate give a rather good picture of what NNs have achieved. Imagine now a new, powerful insight into (forward pass perceptron) neural networks. A whole new way to interpret and look at the popular diagrams of dots, arrows and weights, one that gives you a useful and substantial picture of what a neural network is, what it does, and what you can expect from it. As soon as data are gathered, your program creates a network and there you go. No more architecture or weight guessing. No more tedious backpropagation. No more thousands of "presentations". But wait. Why waste your time with hype? Not only the theory, but the software itself is readily available. Go to the following URL: http://euler.ciens.ucv.ve/~dcrespin/Pub or http://150.185.69.150/~dcrespin/Pub Go there and download NEUROGON software. This is the action I expect to motivate. It is free for academic purposes and for any other non-profit use. The available version of NEUROGON can be greatly improved, but even this rather limited version, once it is tested and used by workers in the field, could give rise to a much larger list of success stories on neural networks. Regards Daniel Crespin From istvan at usl.edu Sun Aug 30 14:53:59 1998 From: istvan at usl.edu (Dr. Istvan S. N. Berkeley) Date: Sun, 30 Aug 1998 13:53:59 -0500 Subject: Connectionist symbol processing: any progress? References: <35D1E00A.7B1CC6A2@icsi.berkeley.edu> <35E5A931.F6E72BE0@comp.uark.edu> Message-ID: <35E99FC7.3025@USL.edu> Hi there, I am afraid that I can no longer resist adding my 2 cents worth to this debate. Douglas Blank wrote: > I believe that in order to solve the big AI/cognitive problems ahead > (like making analogies), we, as modelers, will have to face a radical > idea: we will no longer understand how our models solve a problem > exactly. I mean that, for many complex problems, systems that solve them > won't be able to be broken down into symbols and modules, and, > therefore, there may not be a description of the solution more abstract > than the actual solution itself. It seems to me that there is something fundamentally wrong about the proposal here. As McCloskey (1991) has argued, unless we can develop an understanding of how network models (or any kind of model for that matter) go about solving problems, they will not have any useful impact upon cognitive theorizing. Whilst this may not be a problem for those who wish to use networks merely as a technology, it surely must be a concern to those who wish to deploy networks in the furtherance of cognitive science. If we follow the suggestion made above then even successful attempts at modelling will be theoretically sterile, as we will be creating nothing more than 'black boxes'. This much having been said, the problem of interpreting and analysing trained network systems is not a trivial one, especially for large scale models. Although there are a variety of techniques which have been deployed (see Berkeley et al. 1995, Hanson and Burr 1990 and Elman 1990, for examples), none of them are entirely satisfactory, or universally applicable. Indeed, there has been some skepticism about the feasibility of trained network analysis in the literature (see Hecht-Nielsen 1990, Mozer and Smolensky 1989 and Robinson 1992). Nonetheless, if connectionist networks are to prove useful to cognitive science, continuing efforts to better understand mature networks are going to be crucial.
A further point which needs to be raised (and sorry, this is where the self-advertising begins) is that some efforts at trained network analysis have turned up surprising results, which seem highly germane to the topic of connectionist symbol processing. Some years ago myself and a number of members of The Biological Computation Project undertook the analysis of a network trained upon a logic problem originally studied by Bechtel and Abrahamsen (1991). Much to our surprise, our analysis showed that the network had developed stable patterns of hidden unit activation which closely mirrored the standard rules of inference from traditional sentential calculus, such as Modus Ponens. Moreover, we were able to make a number of useful and abstract generalizations about network functioning which were novel and informative. These results directly challenged the conclusions originally drawn about the task by Bechtel and Abrahamsen (1991). This work is described in detail in Berkeley et al. (1995). What this work suggests is that, rather than abandoning attempts at understanding mature networks, a more rational and productive path is to attempt to analyse in detail trained network function and then use the empirical results from such studies to inform judgements about connectionist symbol processing. All the best, Istvan Bibliography Bechtel, W. and Abrahamsen, A. (1991), *Connectionism and the Mind*, Basil Blackwell (Cambridge, MA). Berkeley, I., Dawson, M., Medler, D., Schopflocher, D. and Hornsby, L. (1995) "Density Plots of Hidden Value Unit Activations Reveal Interpretable Bands" in *Connection Science* 7/2, pp. 167-186. Elman, J. (1990), "Finding Structure in Time", in *Cognitive Science* 14, pp. 179-212. Hanson, S. and Burr, D. (1990), "What Connectionist Models Learn: Learning and Representation in Connectionist Networks" in Behavioral and Brain Sciences 13, pp. 471-518. Hecht-Nielsen, R. (1990), *Neurocomputing*, Addison-Wesley Pub. Co. (New York). McCloskey, M. (1991), "Networks and Theories: The Place of Connectionism in Cognitive Science", *Psychological Science* 2/6, pp. 387-395. Mozer, M. and Smolensky, P. (1989), "Using Relevance to Reduce Network Size Automatically", in *Connection Science* 1, pp. 3-16. Robinson, D. (1992) "Implications of Neural Networks for How we Think about Brain Function" in Behavioral and Brain Sciences, 15, pp. 644-655. -- Istvan S. N. Berkeley Ph.D, E-mail: istvan at USL.edu, Philosophy, The University of Southwestern Louisiana, USL P. O. Box 43770, Lafayette, LA 70504-3770, USA. Tel:(318) 482 6807, Fax: (318) 482 6195, http://www.ucs.usl.edu/~isb9112 From adamidis at egnatia.ee.auth.gr Mon Aug 31 05:45:32 1998 From: adamidis at egnatia.ee.auth.gr (Panagiotis Adamidis) Date: Mon, 31 Aug 1998 12:45:32 +0300 Subject: New digital library Message-ID: <199808310945.MAA20151@egnatia.ee.auth.gr> ********Apologies if you receive multiple copies of this email********** Dear colleagues, I'd like to inform you of a new "digital library" available at the following URL: http://www.it.uom.gr/pdp/digital.htm. It contains a lot of resources on the following subjects: Artificial Life, Complex Systems, Evolutionary Computation, Fuzzy Systems, Neural Networks, Parallel and Distributed Processing. Our initial intention was to provide a library (on topics of interest to our lab) that is useful, effective, easy-to-use, up-to-date and attractive to both non-experts and specialists. We hope that the final(?) result is at least useful. We intend to enhance and keep the library up-to-date.
This is very difficult (maybe impossible) without feedback from the users of this library. Your additions/corrections/deletions would be greatly appreciated. The library is maintained at the Parallel & Distributed Processing Lab. of the Department of Applied Informatics of Univ. of Macedonia, Thessaloniki, Greece. Hope you use it, and send us your feedback. Panagiotis Adamidis Associate researcher, Dept. of Applied Informatics, Univ. of Macedonia Thessaloniki, Greece. Email: adamidis at uom.gr From lorincz at iserv.iki.kfki.hu Mon Aug 31 06:55:44 1998 From: lorincz at iserv.iki.kfki.hu (Andras Lorincz) Date: Mon, 31 Aug 1998 12:55:44 +0200 (MET) Subject: Hippocampus and independent component analysis Message-ID: I would like to announce the availability of a new paper on the functional model of the hippocampus. The model fits smoothly with the overall anatomical structure and its two-phase operational mode. The model is built on two basic postulates: (1) the entorhinal-hippocampal loop serves as a control loop with control errors initiating plastic changes in the hippocampus and (2) the hippocampal output develops independent components in the entorhinal cortex. The paper AUTHOR: Andras Lorincz TITLE: Forming independent components via temporal locking of reconstruction architectures: a functional model of the hippocampus JOURNAL: Biological Cybernetics (in press) may be obtained from: http://iserv.iki.kfki.hu/New/pub.html Abstract: The assumption is made that the formulation of relations as independent components (IC) is a main feature of computations accomplished by the brain. Further, it is assumed that memory traces made of non-orthonormal ICs make use of feedback architectures to form internal representations. Feedback then leads to delays, and delays in cortical processing form an obstacle to this relational processing. The problem of delay compensation is formulated as a speed-field tracking task and is solved by a novel control architecture. It is shown that in addition to delay compensation the control architecture can also shape long term memories to hold independent components if a two-phase operation mode is assumed. Features such as a trisynaptic loop and a recurrent collateral structure at the second stage of that loop emerge in a natural fashion. Based on these properties a functional model of the hippocampal loop is constructed. Andras Lorincz Department of Information Systems Eotvos Lorand University, Budapest, Hungary From rafal at idsia.ch Mon Aug 31 07:51:44 1998 From: rafal at idsia.ch (Rafal Salustowicz) Date: Mon, 31 Aug 1998 13:51:44 +0200 (MET DST) Subject: Hierarchical Probabilistic Incremental Program Evolution Message-ID: H-PIPE: FACILITATING HIERARCHICAL PROGRAM EVOLUTION THROUGH SKIP NODES Rafal Salustowicz Juergen Schmidhuber Technical Report IDSIA-8-98, IDSIA, Switzerland To evolve structured programs we introduce H-PIPE, a hierarchical extension of Probabilistic Incremental Program Evolution (PIPE). Structure is induced by "hierarchical instructions" (HIs) limited to top-level, structuring program parts. "Skip nodes" (SNs) inspired by biology's introns (non-coding segments) allow for switching program parts on and off. In our experiments H-PIPE outperforms PIPE, and SNs facilitate synthesis of certain structured programs but not unstructured ones. We conclude that introns can be particularly useful in the presence of structural bias.
ftp://ftp.idsia.ch/pub/rafal/TR-8-98-H-PIPE.ps.gz http://www.idsia.ch/~rafal/research.html Short version: Evolving Structured Programs with Hierarchical Instructions and Skip Nodes. In J. Shavlik, ed., Machine Learning: Proceedings of the Fifteenth International Conference (ICML'98), pages 488-496, Morgan Kaufmann Publishers, San Francisco, 1998. ftp://ftp.idsia.ch/pub/rafal/ICML98_H-PIPE.ps.gz Rafal & Juergen, IDSIA www.idsia.ch From geoff at giccs.georgetown.edu Mon Aug 31 11:18:11 1998 From: geoff at giccs.georgetown.edu (Geoff Goodhill) Date: Mon, 31 Aug 1998 11:18:11 -0400 Subject: Postdoc position available Message-ID: <199808311518.LAA09521@fathead.giccs.georgetown.edu> POSTDOC IN DEVELOPMENTAL NEUROSCIENCE Georgetown Institute for Cognitive and Computational Sciences Georgetown University Washington DC A postdoctoral position is available immediately in the lab of Dr Geoff Goodhill to develop a novel experimental assay for the quantitative characterization of axon guidance mechanisms. This project is a collaboration with Dr Jeff Urbach (Physics, Georgetown) and Dr Linda Richards (Anatomy and Neurobiology, University of Maryland at Baltimore). An interest in neural development and a strong background in tissue culture techniques are required. Interest in theoretical models is a plus (see TINS, 21, 226-231). More information about the lab can be found at http://www.giccs.georgetown.edu/labs/cns Applicants should send a CV, a letter of interest, and names and addresses (including email) of at least two referees to: Dr Geoffrey J. Goodhill Georgetown Institute for Cognitive and Computational Sciences Georgetown University Medical Center 3970 Reservoir Road Washington DC 20007 Tel: (202) 687 6889 Fax: (202) 687 0617 Email: geoff at giccs.georgetown.edu From mac+ at andrew.cmu.edu Mon Aug 31 14:26:14 1998 From: mac+ at andrew.cmu.edu (Mary Anne Cowden) Date: Mon, 31 Aug 1998 14:26:14 -0400 (EDT) Subject: Carnegie Symposium on Mechanisms of Cognitive Development, Oct 9-11, 1998 Message-ID: =============================================================== CALL FOR PARTICIPATION The 29th Carnegie Symposium on Cognition Mechanisms of Cognitive Development: Behavioral and Neural Perspectives October 9 - 11, 1998 James L. McClelland and Robert S. Siegler, Organizers ---------------------------------------------------------------------------- The 29th Carnegie Symposium on Cognition is sponsored by the Department of Psychology and the Center for the Neural Basis of Cognition. The symposium is supported by the National Science Foundation, the National Institute of Mental Health, and the National Institute of Child Health and Human Development. ---------------------------------------------------------------------------- This post contains the following entries relevant to the symposium: * Overview * Schedule of Events * Attending the Symposium * Travel Fellowships ---------------------------------------------------------------------------- Overview This symposium will consider how children's thinking evolves during development, with a focus on the role of experience in causing change. Speakers will examine the processes by which children learn and those that make children ready and able to learn at particular points in development, using both behavioral and neural approaches. Behavioral approaches will include research on the 'microgenesis' of cognitive change over short time periods (e.g., several hour-long sessions) in specific task situations.
Research on cognitive change over longer time scales (months and years) will also be presented, as will research that uses computational modeling and dynamical systems approaches to understand learning and development. Neural approaches will include the study of how neuronal activity and connectivity change during acquisition of cognitive skills in children and adults. Other studies will consider the possible emergence of cognitive abilities through the maturation of brain structures and the effects of experience on the organization of functions in the brain. Developmental anomalies such as autism and attention deficit disorder will also be examined, as windows on normal development. Four questions will be examined throughout the symposium: 1) Why do cognitive abilities emerge when they do during development? 2) What are the sources of developmental and individual differences, and of developmental anomalies in learning? 3) What happens in the brain when people learn? 4) How can experiences be ordered and timed so as to optimize learning? The answers to these questions have strong implications for how we educate children and remediate deficits that impede development of thinking abilities. These implications will be explored in discussions among the participants. ---------------------------------------------------------------------------- The 29th Carnegie Symposium on Cognition: Schedule ---------------------------------------------------------------------------- Friday, October 9th: Studies of the Microgenesis of Cognitive Development 8:30 - 9:00 Continental Breakfast 9:00 Welcome BEHAVIORAL APPROACHES 9:20 Susan Goldin-Meadow, University of Chicago Giving the mind a hand: The role of gesture in cognitive change 10:20 Break 10:40 Robert Siegler, Carnegie Mellon University Microgenetic studies of learning in children and in brain-damaged adults 11:40 Lunch NEUROSCIENCE APPROACHES 1:00 Michael Merzenich, University of California, San Francisco Cortical plasticity phenomenology and mechanisms: Implications for neurorehabilitation 2:00 James L. 
McClelland, Carnegie Mellon University/CNBC Revisiting the critical period: Interventions that enhance adaptation to non-native phonological contrasts in Japanese adults 3:00 Break 3:20 Richard Haier, University of California, Irvine PET studies of learning and individual differences 4:20 Discussant: James Stigler, UCLA Saturday, October 10th: Studies of Change Over Long Time Scales 8:30 - 9:00 Continental Breakfast BEHAVIORAL APPROACHES 9:00 Esther Thelen, Indiana University Dynamic mechanisms of change in early perceptual motor development 10:00 Robbie Case, University of Toronto Differentiation and integration as the mechanisms in cognitive and neurological development 11:00 Break 11:20 Deanna Kuhn, Teacher's College, Columbia University Why development does (and doesn't) occur: Evidence from the domain of inductive reasoning 12:20 Lunch NEUROSCIENCE APPROACHES 2:00 Mark Johnson, Birkbeck College/University College London Cortical specialization for cognitive functions 3:00 Helen Neville, University of Oregon Specificity and plasticity in human brain development 4:00 Break 4:20 Discussant: David Klahr, Carnegie Mellon University Sunday, October 11th: Developmental Disorders 8:30 - 9:00 Continental Breakfast DYSLEXIA 9:00 Albert Galaburda, Harvard Medical School Toxicity of neural plasticity as seen through a model of learning disability AUTISM 10:00 Patricia Carpenter, Marcel Just, Carnegie Mellon University Cognitive load distribution in normal and autistic individuals 11:00 Break ATTENTION DEFICIT DISORDER 11:20 B. J. Casey, University of Pittsburgh Medical Center Disruption and inhibitory control in developmental disorders: A mechanistic model of implicated frontostriatal circuitry 12:20 Concluding discussant: Michael I. Posner, University of Oregon ---------------------------------------------------------------------------- Attending the Symposium Sessions on Friday, October 9 will be held in McConomy Auditorium, University Center, Carnegie Mellon. Sessions on Saturday, October 10 and Sunday, October 11 will be held in the Adamson Wing, Room 135 Baker Hall. Admission is free, and everyone is welcome to attend. Out of town visitors can contact Mary Anne Cowden, (412) 268-3151, mac+ at cmu.edu, for additional information. --------------------------------------------------------------------------- This material is based on the symposium web-page: http://www.cnbc.cmu.edu/carnegie-symposium ---------------------------------------------------------------------------- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Mary Anne Cowden, Baker Hall 346 C Psychology Dept, Carnegie Mellon University 5000 Forbes Ave., Pittsburgh, PA 15213 Phone: 412/268-3151 Fax: 412/268-3464 http://www.contrib.andrew.cmu.edu/~mac/ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From zhuh at santafe.edu Mon Aug 31 16:01:23 1998 From: zhuh at santafe.edu (Huaiyu Zhu) Date: Mon, 31 Aug 1998 14:01:23 -0600 (MDT) Subject: Error decomposition and model complexity. In-Reply-To: Message-ID: The following paper has been submitted to Neural Computation: http://www.santafe.edu/~zhuh/draft/edmc.ps.gz Error Decomposition and Model Complexity Huaiyu Zhu Bayesian information geometry provides a general error decomposition theorem for arbitrary statistical models and a family of information deviations that include Kullback-Leibler information as a special case. When applied to Gaussian measures it takes the classical Hilbert space (Sobolev space) theories for estimation (regression, filtering, approximation, smoothing) as a special case. 
When the statistical and computational models are properly distinguished, the dilemmas of over-fitting and ``curse of dimensionality'' disappears, and the optimal model order disregarding computing cost is always infinity. Cited papers that have not appeared in print can be obtained through the web page below. -- Huaiyu Zhu Tel: 1 505 984 8800 ext 305 Santa Fe Institute Fax: 1 505 982 0565 1399 Hyde Park Road mailto:zhuh at santafe.edu Santa Fe, NM 87501 http://www.santafe.edu/~zhuh/ USA ftp://ftp.santafe.edu/pub/zhuh/ From becker at curie.psychology.mcmaster.ca Mon Aug 31 16:25:37 1998 From: becker at curie.psychology.mcmaster.ca (Sue Becker) Date: Mon, 31 Aug 1998 16:25:37 -0400 (EDT) Subject: Calls for Participation: NIPS*98 Workshops Message-ID: Dear Connectionists, Below are brief annoucements of the 20 NIPS*98 workshops taking place in Breckenridge, Colorado on December 4-5 following the main conference in Denver. Many of these have published web pages with further details. See http://www.cs.cmu.edu/Groups/NIPS/1998/Workshops.html and the URLs listed below. Rich Zemel and Sue Becker, NIPS*98 Workshops Co-chairs ---------------------------------------------------------------------- DYNAMICS IN NETWORKS OF SPIKING NEURONS http://diwww.epfl.ch/w3lami/team/gerstner/NIPS_works.html Organizer: W. Gerstner (Lausanne, Switzerland) Networks of spiking neurons have several interesting dynamic properties, for example very rapid and characteristic transients, synchronous firing and asynchronous states. A better understanding of typical phenomena has important implications for problems associated with neuronal coding (spikes or rates). For example, the population activity is a rate-type quantity, but does not need temporal averaging - which suggests fast rate coding as a potential strategy. The idea of the workshop is to start from mathematical models of network dynamics, see what is known in terms of results, and then try to find out what the implications for 'coding' in the most general sense could be. ---------------------------------------------------------------------- POPULATION CODING Organizers: Glen D. Brown, The Salk Institute Kechen Zhang, The Salk Institute We will explore experimental approaches to population coding in three parts. First, we will examine techniques for recording from populations of neurons including electrode arrays and optical methods. Next, we will discuss spike-sorting and other issues in data analysis. Finally, we will examine strategies for interpreting population data, including population recordings from the hippocampus. To facilitate discussion, we are establishing a data base of neuronal-population recordings that will be available for analysis and interpretation. For more information, please contact Glen Brown (glen at salk.edu) or Kechen Zhang (zhang at salk.edu) Computational Neurobiology Laboratory The Salk Institute for Biological Studies 10010 North Torrey Pines Road La Jolla, CA 92037 ---------------------------------------------------------------------- TEMPORAL CODING: IS THERE EVIDENCE FOR IT AND WHAT IS ITS FUNCTION? http://www.cs.cmu.edu/Groups/NIPS/1998/Workshop-CFParticipation/hatsopoulos.html Organizers: Nicho Hatsopoulos and Harel Shouval Brown University Departments of Neuroscience and Physics One of the most fundamental issues in neuroscience concerns the exact nature of neural coding or representation. The standard view is that information is represented in the firing rates of single or populations of neurons. 
Recently, a growing body of research has provided evidence for coding strategies based on more precise temporal relationships among spikes. These are some of the questions that the workshop intends to address: 1. What do we mean by temporal coding? What time resolution constitutes a temporal code? 2. What evidence is there for temporal coding in the nervous system? 3. What functional role does it play? What computational problem can it solve that firing rate cannot? 4. Is it feasible to implement given the properties of neurons and their interactions? We intend to organize it as a debate with formal presentations and informal discussion with some of the major figures in the field. Different views regarding this subject will be presented. We will invite speakers doing work in a variety of areas including both vertebrate and invertebrate systems. ---------------------------------------------------------------------- OPTICAL IMAGING OF THE VISUAL CORTEX http://camelot.mssm.edu/~udi Organizers: Ehud Kaplan, Gary Blasdel It is clear that any attempt to model brain function or development will require access to data about the spatio-temporal distribution of activity in the brain. Optical imaging of the brain provides a unique opportunity to obtain such maps, and thus is essential for scientists who are interested in theoretical approaches to neuroscience. In addition, contact of biologists with theoretical approaches could help them focus their studies on the essential theoretical questions, and on new computational, mathematical, or theoretical tools and techniques. We therefore organized a 6-hour workshop on optical imaging of the cortex, to deal with both technical issues and physiological results. The workshop will have the format of a mini-symposium, and will be chaired by Ehud Kaplan (Mt. Sinai School of Medicine) and Gary Blasdel (Harvard). Technical issues to be discussed include: 1. What is the best way to extract faint images from the noisy data? 2. How does one compare/relate functional maps? 3. What is the best wavelength for reflectance measurements? 4. What is the needed (or possible) spatial resolution? 5. How do you deal with brain movement and other artifacts? See also: http://camelot.mssm.edu/~udi ---------------------------------------------------------------------- OLFACTORY CODING: MYTHS, MODELS AND DATA http://www.wjh.harvard.edu/~linster/nips98.html Organizers: Christiane Linster, Frank Grasso and Wayne Getz Currently, two main models of olfactory coding are competing with each other: (1) the selective receptor, labeled line model which has been popularized by recent results from molecular biology, and (2) the non-selective receptor, distributive coding model, supported mainly by data from electrophysiology and imaging in the olfactory bulbs. In this workshop, we will discuss experimental evidence for each model. Theoreticians and experimentalists together will discuss the implications for olfactory coding and for neural processing in the olfactory bulb and cortex for each of the two predominant and, possibly, intermediate models. ---------------------------------------------------------------------- STATISTICAL THEORIES OF CORTICAL FUNCTION http://www.cnl.salk.edu/~rao/workshop.html Organizers: Rajesh P.N. Rao, Salk Institute (rao at salk.edu) Bruno A. Olshausen, UC Davis (bruno at redwood.ucdavis.edu) Michael S.
Lewicki, Salk Institute (lewicki at salk.edu) Participants are invited to attend a post-NIPS workshop on theories of cortical function based on well-defined statistical principles such as maximum likelihood and Bayesian estimation. Topics that are expected to be addressed include: statistical interpretations of the function of lateral and cortico-cortical feedback connections, theories of perception and neural representations in the cortex, and development of cortical receptive field properties from natural signals. For further details, see: http://www.cnl.salk.edu/~rao/workshop.html ---------------------------------------------------------------------- LEARNING FROM AMBIGUOUS AND COMPLEX EXAMPLES Organizers: Oded Maron, PHZ Capital Partners Thomas Dietterich, Oregon State University Frameworks such as supervised learning, unsupervised learning, and reinforcement learning have many established algorithms and theoretical tools to analyze them. However, there are many learning problems that do not fall into any of these established frameworks. Specifically, situations where the examples are ambiguously labeled or cannot be simply represented as a feature vector tend to be difficult for these frameworks. This workshop will bring together researchers who are interested in learning from ambiguous and complex examples. The workshop will include, but not be limited to, discussions of Multiple-Instance Learning, TDNN, bounded inconsistency, and other frameworks for learning in unusual situations. ---------------------------------------------------------------------- TURNKEY ALGORITHMS FOR IMPROVING GENERALIZERS http://ic.arc.nasa.gov/ic/people/kagan/nips98.html Organizers: Kagan Tumer and David Wolpert NASA Ames Research Center Abstract: Methods for improving generalizers, such as stacking, bagging, boosting and error correcting output codes (ECOCs) have recently been receiving a lot of attention. We call such techniques "turnkey" techniques. This reflects the fact that they were designed to improve the generalization ability of generic learning algorithms, without detailed knowledge about the inner workings of those learners. Whether one particular turnkey technique is, in general, "better" than all others, and if so under what circumstances, is a hotly debated issue. Furthermore, it isn't clear whether it is meaningful to ask that question without specific prior assumptions (e.g., specific domain knowledge). This workshop aims at investigating these issues, building a solid understanding of how and when turnkey techniques help generalization ability, and lay out a road map to where the turnkey methods should go. ---------------------------------------------------------------------- MINING MASSIVE DATABASES: SCALABLE ALGORITHMS FOR DATA MINING http://research.microsoft.com/~fayyad/nips98/ Organizers: Usama Fayyad and Padhraic Smyth With the explosive growth in the number of "data owners", interest in scalable, integrated, data mining tools is reaching new heights. This 1-day workshop aims at bringing together researchers and practitioners from several communities to address topics of mutual interest (and misunderstanding) such as: scaling clustering and prediction to large databases, robust algorithms for high dimensions, mathmatical approaches to mining massive datasets, anytime algorithms, and dealing with discrete, mixed, and multimedia (unstructured) data. 
The invited talks will be used to drive discussion around the issues raised, common problems, and definitions of research problems that need to be addressed. Important questions include: why the need for integration with databases? why deal with massive data stores? What are most effective ways to scale algorithms? How do we help unsophisticated users visualize the data/models extracted? Contact information: Usama Fayyad (Microsoft Research), Fayyad at microsoft.com, http://research.microsoft.com/~fayyad Padhraic Smyth (U.C. Irvine), Smyth at sifnos.ics.uci.edu, http://www.ics.uci.edu/~smyth/ ---------------------------------------------------------------------- INTEGRATING SUPERVISED AND UNSUPERVISED LEARNING www.cs.cmu.edu/~mccallum/supunsup Organizers: Rich Caruana, Just Research Virginia de Sa, UCSF Andrew McCallum This workshop will debate the relationship between supervised and unsupervised learning. The discussion will run the gamut from examining the view that supervised learning can be performed by unsupervised learning of the joint distribution between the inputs and targets, to discussion of how natural learning systems do supervised learning without explicit labels, to the presentation of practical methods of combining supervised and unsupervised learning by using unsupervised clustering or unlabelled data to augment a labelled corpus. The debate should be fun because some attendees believe supervised learning has clear advantages, while others believe unsupervised learning is the only game worth playing in the long run. More information (including a call for abstracts) can be found at www.cs.cmu.edu/~mccallum/supunsup. ---------------------------------------------------------------------- LEARNING ON RELATIONAL DATA REPRESENTATIONS http://ni.cs.tu-berlin.de/nips98/ Organizers: Thore Graepel, TU Berlin, Germany Ralf Herbrich, TU Berlin, Germany Klaus Obermayer, TU Berlin, Germany Symbolic (structured) data representations such as strings, graphs or logical expressions often provide a more natural basis for learning than vector space representations which are the standard paradigm in connectionism. Symbolic representations are currently subject to an intensive discussion (cf. the recent postings on the connectionist mailing list), which focuses on the question if connectionist models can adequately process symbolic input data. One way of dealing with structured data is to characterize them in relation to each other. To this end a set of data items can be characterized by defining a dissimilarity or distance measure on pairs of data items and to provide learning algorithms with a dissimilarity matrix of a set of training data. Prior knowledge about the data at hand can be incorporated explicitly in the definition of the dissimilarity measure. One can even go as far as trying to learn a distance measure appropriate for the task at hand. This procedure may provide a bridge between the vector space and the "structural" approaches to pattern recognition and should thus be of interest to people from both communities. Additionally, pairwise and other non-vectorial input data occur frequently in empirical sciences and pose new problems for supervised and unsupervised learning techniques. 
More information can be found at http://ni.cs.tu-berlin.de/nips98/ ------------------------------------------------------------------ SEQUENTIAL INFERENCE AND LEARNING http://svr-www.eng.cam.ac.uk/~jfgf/workshop.html Organizers: Mahesan Niranjan, Cambridge University Engineering Department Arnaud Doucet, Cambridge University Engineering Department Nando de Freitas, Cambridge University Engineering Department Sequential techniques are important in many applications of neural networks involving real-time signal processing, where data arrival is inherently sequential. Furthermore, one might wish to adopt a sequential training strategy to deal with non-stationarity in signals, so that information from the recent past is lent more credence than information from the distant past. Sequential methods also allow us to efficiently compute important model diagnostic tools such as the one-step-ahead prediction densities. The advent of cheap and massive computational power has stimulated many recent advances in this field, including dynamic graphical models, Expectation-Maximisation (EM) inference and learning for dynamical models, dynamic Kalman mixture models and sequential Monte Carlo sampling methods. More importantly, such methods are being applied to a large number of interesting real problems such as computer vision, econometrics, medical prognosis, tracking, communications, blind deconvolution, statistical diagnosis, automatic control and neural network training. _______________________________________________________________________________ ABSTRACTION AND HIERARCHY IN REINFORCEMENT LEARNING http://www-anw.cs.umass.edu/~dprecup/call_for_participation.html Organizers: Tom Dietterich, Oregon State University Leslie Kaelbling, Brown University Ron Parr, Stanford University Doina Precup, University of Massachusetts, Amherst When making everyday decisions, people are able to foresee the consequences of their possible courses of action at multiple levels of abstraction. Recent research in reinforcement learning (RL) has focused on the way in which knowledge about abstract actions and abstract representations can be incorporated into the framework of Markov Decision Processes (MDPs). Several theoretical results and applications suggest that these methods can improve significantly the scalability of reinforcement learning systems by accelerating learning and by promoting sharing and re-use of learned subtasks. This workshop aims to address the following issues in this area: - Task formulation and automated task creation - The degree and complexity of action models - The integration of different abstraction methods - Hidden state issues - Utility and computational efficiency considerations - Multi-layer abstractions - Temporally extended perception - The design of autonomous agents based on hierarchical RL architectures We are looking for volunteers to lead discussions and participate in panels. We will also accept some technical papers for presentations. For more details, please check out the workshop page: http://www-anw.cs.umass.edu/~dprecup/call_for_participation.html ---------------------------------------------------------------------- MOVEMENT PRIMITIVES: BUILDING BLOCKS FOR LEARNING MOTOR CONTROL http://www-slab.usc.edu/events/nips98 Organizers: Stefan Schaal (USC/ERATO(JST)) and Steve DeWeerth (GaTech) Traditionally, learning control has been dominated by representations that generate low level actions in response to some measured state information. 
The learning of appropriate trajectory plans or control policies is usually based on optimization approaches and reinforcement learning. It is well known that these methods do not scale well to high dimensional control problems, that they are computationally very expensive, that they are not particularly robust to unforeseen perturbations in the environment, and that it is hard to re-use these representations for related movement tasks. In order to make progress towards a better understanding of biology and to create movement systems that can automatically build new representations, it seems to be necessary to develop a framework of how to control and to learn control with movement primitives. This workshop will bring together neuroscientists, roboticists, engineers, and mathematicians to explore how to approach the topic of movement primitives in a principled way. Topics of the workshop include the questions such as: what are appropriate movement primitives, how are primitives learned, how can primitives be inserted into control loops, how are primitives sequenced, how are primitives combined to form new primitives, how is sensory information used to modulate primitives, how primitives primed for a particular task, etc. These topics will be addressed from a hybrid perspective combining biological and artificial movement systems. ---------------------------------------------------------------------- LARGE MARGIN CLASSIFIERS http://svm.first.gmd.de/nips98/ Organizers: Alex J. Smola, Peter Bartlett, Bernhard Schoelkopf, Dale Schuurmans Many pattern classifiers are represented as thresholded real-valued functions, eg: sigmoid neural networks, support vector machines, voting classifiers, and Bayesian schemes. Recent theoretical and experimental results show that such learning algorithms frequently produce classifiers with large margins---where the margin is the amount by which the classifier's prediction is to the correct side of threshold. This has led to the important discovery that there is a connection between large margins and good generalization performance: classifiers that achieve large margins on given training data also tend to perform well on future test data. This workshop aims to provide an overview of recent developments in large margin classifiers (ranging from theoretical results to applications), to explore connections with other methods, and to identify directions for future research. The workshop will consist of four sessions over two days: - Mathematical Programming - Support Vector and Kernel Methods, - Voting Methods (Boosting, Bagging, Arcing, etc), and - Connections with Other Topics (including an organized panel discussion) Further details can be found at http://svm.first.gmd.de/nips98/ ---------------------------------------------------------------------- DEVELOPMENT AND MATURATION IN NATURAL AND ARTIFICIAL STRUCTURES http://www.cs.cmu.edu/Groups/NIPS/1998/Workshop-CFParticipation/haith.html Organizers: Gary Haith, Computational Sciences, NASA Ames Research Center Jeff Elman, Cognitive Science, UCSD Silvano Colombano, Computational Sciences, NASA Ames Research Center Marshall Haith, Developmental Psychology, University of Denver We believe that an ongoing collaboration between computational work and developmental work could help unravel some of the most difficult issues in each domain. 
Computational work can address dynamic, hierarchical developmental processes that have been relatively intractable to traditional developmental analysis, and developmental principles and theory can generate insight into the process of building and modeling complex and adaptive computational structures. In hopes of bringing developmental processes and analysis into the neural modeling mainstream, this session will focus developmental modelers and theorists on the task of constructing a set of working questions, issues and approaches. The session will hopefully include researchers studying developmental phenomena across all levels of scale and analysis, with the aim of highlighting both system-specific and general features of development. For more information, contact: Gary Haith, Computational Sciences, NASA Ames Research Center phone #: (650) 604-3049 FAX #: (650) 604-3594 E-mail: haith at ptolemy.arc.nasa.gov Mail: NASA Ames Research Center Mail Stop 269-3 Mountain View, CA 94035-1000 ---------------------------------------------------------------------- HYBRID NEURAL SYMBOLIC INTEGRATION http://osiris.sunderland.ac.uk/~cs0stw/wermter/workshops/nips-workshop.html Organizers: Stefan Wermter, University of Sunderland, UK Ron Sun, University of Alabama, USA In the past it was very controversial whether neural or symbolic approaches alone will be sufficient to provide a general framework for intelligent processing. The motivation for the integration of symbolic and neural models of cognition and intelligent behavior comes from many different sources. From the perspective of cognitive neuroscience, a symbolic interpretation of an artificial neural network architecture is desirable, since the brain has a neuronal structure and the capability to perform symbolic processing. From the perspective of knowledge-based processing, hybrid neural/symbolic representations are advantageous, since different mutually complementary properties can be integrated. However, neural representations show advantages for gradual analog plausibility, learning, robust fault-tolerant processing, and generalization to similar input. Areas of interest include: Integration of symbolic and neural techniques for language and speech processing, reasoning and inferencing, data mining, integration for vision, language, multimedia; combining fuzzy/neuro techniques in engineering; exploratory research in emergent symbolic behavior based on neural networks, interpretation and explanation of neural networks, knowledge extraction from neural networks, interacting knowledge representations, dynamic systems and recurrent networks, evolutionary techniques for cognitive tasks (language, reasoning, etc), autonomous learning systems for cognitive agents that utilize both neural and symbolic learning techniques. For more information please see http://osiris.sunderland.ac.uk/~cs0stw/wermter/workshops/nips-workshop.html Workshop contact person: Professor Stefan Wermter Research Chair in Intelligent Systems University of Sunderland School of Computing & Information Systems St Peters Way Sunderland SR6 0DD United Kingdom phone: +44 191 515 3279 fax: +44 191 515 2781 email: stefan.wermter at sunderland.ac.uk http://osiris.sunderland.ac.uk/~cs0stw/ ---------------------------------------------------------------------- SIMPLE INFERENCE HEURISTICS VS. COMPLEX DECISION MACHINES http://www.cs.cmu.edu/Groups/NIPS/1998/Workshop-CFParticipation/todd.html Organizers: Peter M. 
Todd, Laura Martignon, Kathryn Blackmond Laskey Participants and presentations are invited for this post-NIPS workshop on the contrast in both psychology and machine learning between a probabilistically-defined view of rational decision making, with its apparent demand for complex Bayesian models, and a more performance-based view of rationality built on the use of simple, fast and frugal decision heuristics. ---------------------------------------------------------------------- CONTINUOUS LEARNING http://www.forwiss.uni-erlangen.de/aknn/cont-learn/ Organizers: Peter Protzel, Lars Kindermann, Achim Lewandowski, and Michael Tagscherer FORWISS and Chemnitz University of Technology, Germany By continuous learning we mean that learning takes place all the time and is not interrupted, that there is no difference between periods of training and operation, and that learning AND operation start with the first pattern. In this workshop, we will especially focus on the approximation of non-linear, time-varying functions. The goal is modeling and adapting the model to follow the changes of the underlying process, not merely forecasting the next output. In order to facilitate the comparison of the various methods, we provide different benchmark data sets and participants are encouraged to discuss their results on these benchmarks during the workshop. Further information: http://www.forwiss.uni-erlangen.de/aknn/cont-learn/ ---------------------------------------------------------------------- LEARNING CHIPS AND NEUROBOTS http://bach.ece.jhu.edu/nips98 Organizers: Gert Cauwenberghs, Johns Hopkins University Ralph Etienne-Cummings, Johns Hopkins University Marwan Jabri, Sydney University This workshop aims at a better understanding of how different approaches to learning and sensorimotor control, including algorithms and hardware, from backgrounds in neuromorphic VLSI, robotics, neural nets, AI, genetic programming, etc., can be combined to create more intelligent systems interacting with their environment. We encourage active participation, and welcome live demonstrations of systems. The panel represents a wide range of disciplines. Machine learning approaches include: reinforcement learning, TD-lambda (or predictive Hebbian learning), Q-learning, and classical as well as operant conditioning. VLSI implementations cover some of these, integrated on-chip, plus the sensory and motor interfaces. Evolutionary approaches cover genetic techniques, applied to populations of robots. Finally, we have designers of microrobots and walking robots on the panel. This list is by no means exhaustive! More information can be found at URL: http://bach.ece.jhu.edu/nips98 __________________________________________________________________________