From lwc%digsys.engineering.cambridge.ac.uk at NSS.Cs.Ucl.AC.UK Tue Oct 4 12:44:44 1988 From: lwc%digsys.engineering.cambridge.ac.uk at NSS.Cs.Ucl.AC.UK (Laiwan Chan) Date: Tue, 4 Oct 88 12:44:44 BST Subject: Roommate for NIPS*88 in Denver Message-ID: <20404.8810041144@dsl.eng.cam.ac.uk> There is another poor student looking for someone who is willing to share a hotel room in the NIPS conference in Denver. I will attend the conference and the post workshop, and would like to find a lady to share a hotel room and the cost with me during that period. Alternative suggestions for cheaper hotels/dormitories/shelters etc nearby are very welcome. For your convenience, I am a Ph.D. student working in Engineering Department of Cambridge University. Thanks in Advance, Lai-Wan CHAN, (Miss), Engineering Dept., Trumpington Street, Cambridge, CB2 1PZ, England. email : lwc at uk.ac.cam.eng.dsl%uk.ac.rl.earn From pratt at paul.rutgers.edu Tue Oct 4 13:59:07 1988 From: pratt at paul.rutgers.edu (Lorien Y. Pratt) Date: Tue, 4 Oct 88 13:59:07 EDT Subject: Josh Alspector to speak at Rutgers on neural network learning chip Message-ID: <8810041759.AA12607@zztop.rutgers.edu> Fall, 1988 Neural Networks Colloquium Series at Rutgers Electronic Models of Neuromorphic Networks ------------------------------------------ Joshua Alspector Bellcore, Morristown, NJ 07960 Room 705 Hill center, Busch Campus Piscataway, NJ Friday October 21, 1988 at 11:10 am Refreshments served before the talk Abstract We describe how current models of computation in the brain can be physically implemented using VLSI technology. This includes modeling of sensory processes, memory, and learning. We have fabricated a test chip in 2 micron CMOS that can perform supervised learning in a manner similar to the Boltzmann machine. The chip learns to solve the XOR problem in a few milliseconds. Patterns can be presented to it at 100,000 per second. We also have demonstrated the capability to do unsupervised competitive learning as well as supervised learning. From fortes at ee.ecn.purdue.edu Tue Oct 4 15:22:22 1988 From: fortes at ee.ecn.purdue.edu (Jose A Fortes) Date: Tue, 4 Oct 88 14:22:22 EST Subject: Hector Sussmann to speak on formal analysis of Boltzmann Machine Learning Message-ID: <8810041922.AA19227@ee.ecn.purdue.edu> To:sussmann at math.rutgers.edu, fortes From:fortes at ee.ecn.purdue.edu Subject:technical information Could you please forward to me any technical papers/reports that discuss the work that you intend to report on at the seminar mentioned below. My address is Jose A.B. Fortes Assistant Professor School of Electrical Engineering Purdue University W. Lafayette, IN 47907 Thanks a lot. J. Fortes +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Fall, 1988 Neural Networks Colloquium Series at Rutgers On the theory of Boltzmann Machine Learning ------------------------------------------- Hector Sussmann Rutgers University Mathematics Department Room 705 Hill center, Busch Campus Friday October 14, 1988 at 11:10 am Refreshments served before the talk Abstract The Boltzmann machine is an algorithm for learning in neural networks, involving alternation between a ``learning'' and ``hallucinating'' phase. In this talk, I will present a Boltzmann machine algorithm for which it can be proven that, for suitable choices of the parameters, the weights converge so that the Boltzmann machine correctly classifies all training data. 
This is because the evolution of the weights follow very closely, with very high probability, an integral trajectory of the gradient of the likelihood function whose global maxima are exactly the desired weight patterns. From terry Thu Oct 6 13:26:55 1988 From: terry (Terry Sejnowski ) Date: Thu, 6 Oct 88 13:26:55 edt Subject: Neural Computation Message-ID: <8810061726.AA04156@crabcake.cs.jhu.edu> Neural Computation, a new journal published by the MIT Press, is on schedule for its first issue in February, 1989. We have about 20 very high quality papers in review, with many exciting new results. The first issue will also contain a long review on neural networks for speech recognition by Richard Lippmann --- send him papers that should be considered: Richard Lippmann B-349 MIT Lincoln Laboratory Lexington, MA 02173 Please note that all research communications should be short -- up to 2000 words and 4 figures. These are intended as announcements of new and important results that merit a wide audience. Note that comparable journals in physics and biology, such as Physical Review Letters and Nature, are among the most prestigious research journals. Unlike the short research communications that appear in some specialty journals, papers in Neural Computation will have high visibility and wide circulation. Publication will be fast --- 3 months turnaround will be guaranteed from the day of acceptance for the first year. Terry Sejnowski ----- From terry at cs.jhu.edu Thu Oct 6 17:03:55 1988 From: terry at cs.jhu.edu (Terry Sejnowski ) Date: Thu, 6 Oct 88 17:03:55 edt Subject: Neural Computation Subscriptions Message-ID: <8810062103.AA06559@crabcake.cs.jhu.edu> Subscriptions to Neural Computation are available for: $45.00 Individual $90.00 Institution (add $9.00 surface mail or $17.00 postage for outside US and Canada) Available from: MIT Press Journals 55 Hayward Street Cambridge, MA 02142 (617) 253 2889 for credit card orders Terry ----- From skrzypek at CS.UCLA.EDU Thu Oct 6 20:12:10 1988 From: skrzypek at CS.UCLA.EDU (Dr Josef Skrzypek) Date: Thu, 6 Oct 88 17:12:10 PDT Subject: Technical Report In-Reply-To: 's message of 5-OCT-1988 14:49:48.60 <8810060056.AA27357@hera.cs.ucla.edu> Message-ID: <8810070012.AA01808@lanai.cs.ucla.edu> Dear James, Could you please send me one copy of your TR "Representing Simple Arithmetic in Neural Networks" Thank you Josef Prof. Josef Skrzypek SKRZYPEK at cs.ucla.edu Department of Computer Science 3532 BH UCLA Los Angeles CA 90024 From prlb2!welleken at uunet.UU.NET Sat Oct 8 13:00:24 1988 From: prlb2!welleken at uunet.UU.NET (Wellekens) Date: Sat, 8 Oct 88 18:00:24 +0100 Subject: report available Message-ID: <8810081700.AA20933@prlb2.UUCP> The following report is available free of charge from Chris.J.Wellekens, Philips Research Laboratory Brussels, 2 Avenue van Becelaere, B-1170 Brussels,Belgium. Email wlk at prlb2.uucp LINKS BETWEEN MARKOV MODELS AND MULTILAYER PERCEPTRONS H.Bourlard and C.J.Wellekens Philips Research Laboratory Brussels ABSTRACT Hidden Markov models are widely used for automatic speech recognition. They inherently incorporate the sequential character of the speech signal and are statistically trained. However, the a priori choice of a model topology limits the flexibility of the HMM's. Another drawback of these models is their weak discriminating power. Multilayer perceptrons are now promising tools in the connectionist approach for classification problems and have already been successfully tested on speech recognition problems. 
However, the sequential nature of the speech signal remains difficult to handle in that kind of machine. In this paper, a discriminant hidden Markov model is defined and it is shown how a particular multilayer perceptron with contextual and extra feedback input units can be considered as a general form of such Markov models. Relations with other recurrent networks commonly used in speech recognition are also pointed out. Chris From niranjan%digsys.engineering.cambridge.ac.uk at NSS.Cs.Ucl.AC.UK Mon Oct 10 11:59:27 1988 From: niranjan%digsys.engineering.cambridge.ac.uk at NSS.Cs.Ucl.AC.UK (M. Niranjan) Date: Mon, 10 Oct 88 11:59:27 BST Subject: Abstract Message-ID: <15317.8810101059@dsl.eng.cam.ac.uk> Here is an extended summary of a Tech report now available. Apologies for the incomplete de-TeXing. niranjan PS: Remember, reprint requests should be sent to "niranjan at dsl.eng.cam.ac.uk" and NOT "connectionists at q.cs.cmu.edu" ============================================================================= NEURAL NETWORKS AND RADIAL BASIS FUNCTIONS IN CLASSIFYING STATIC SPEECH PATTERNS Mahesan Niranjan & Frank Fallside CUED/F-INFENG/TR 22 University Engineering Department Cambridge, CB2 1PZ, England Email: niranjan at dsl.eng.cam.ac.uk SUMMARY This report compares the performances of three non-linear pattern classifiers in the recognition of static speech patterns. Two of these classifiers are neural networks (Multi-layered perceptron and the modified Kanerva model (Prager & Fallside, 1988)). The third is the method of radial basis functions (Broomhead & Lowe, 1988). The high performance of neural-network based pattern classifiers shows that simple linear classifiers are inadequate to deal with complex patterns such as speech. The Multi-layered perceptron (MLP) gives a mechanism to approximate an arbitrary classification boundary (in the feature space) to a desired precision. Due to this power and the existence of a simple learning algorithm (error back-propagation), this technique is in very wide use nowadays. The modified Kanerva model (MKM) for pattern classification is derived from a model of human memory (Kanerva, 1984). It attempts to take advantage of certain mathematical properties of binary spaces of large dimensionality. The modified Kanerva model works with real valued inputs. It compares an input feature vector with a large number of randomly populated `location cells' in the input feature space; associated with every cell is a `radius'. Upon comparison, the cell outputs value 1 if the input vector lies within a volume defined by the radius; its output is zero otherwise. The discriminant function of the Modified Kanerva classifier is a weighted sum of these location-cell outputs. It is trained by a gradient descent algorithm. The method of radial basis functions (RBF) is a technique for non-linear discrimination. RBFs have been used by Powell (1985) in multi-variable interpolation. The non-linear discriminant function in this method is of the form, g(x) = sum_{j=1}^m lambda_j phi(||x - x_j||) Here, x is the feature vector. lambda_j are weights associated with each of the given training examples x_j. phi is a kernel function defining the range of influence of each data point on the class boundary. For a particular choice of the phi function, and a set of training data {x_j, f_j}, j=1,...,N, the solution for the lambda_j's is closed-form. Thus this technique is computationally simpler than most neural networks.
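As an aside, the closed-form fit just mentioned can be made concrete in a few lines. The sketch below is not from the report; the Gaussian kernel phi(r) = exp(-r^2 / (2 sigma^2)), the kernel width, the use of every training point as a centre, and the toy two-class data are all assumptions chosen for illustration.

# Minimal sketch of the closed-form RBF fit described above.
import numpy as np

def rbf_fit(X, f, sigma=1.0):
    """Solve Phi @ lam = f for the weights lambda_j, one per training example."""
    # Pairwise distances ||x_i - x_j|| between training examples.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    Phi = np.exp(-d**2 / (2.0 * sigma**2))
    return np.linalg.solve(Phi, f)          # closed form: no iterative training

def rbf_eval(X_train, lam, x, sigma=1.0):
    """g(x) = sum_j lambda_j * phi(||x - x_j||)."""
    d = np.linalg.norm(X_train - x, axis=-1)
    return np.dot(lam, np.exp(-d**2 / (2.0 * sigma**2)))

# Toy two-class problem: targets f_j are +1 / -1; classify by the sign of g(x).
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
f = np.array([1.0, -1.0, -1.0, 1.0])        # XOR-like labelling
lam = rbf_fit(X, f, sigma=0.5)
print([np.sign(rbf_eval(X, lam, x, sigma=0.5)) for x in X])
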
When used as a non- parametric technique, each computation at classification stage involves the use of all the training examples. This, however, is not a disadvantage since much of this computing can be done in parallel. In this report, we compare the performance of these classifiers on speech signals. Several techniques similar to the method of radial basis functions are reviewed. The properties of the class boundaries generated by the MLP, MKM and RBF are derived on simple two dimensional examples and an experimental comparison with speech data is given. ============================================================================ From terry Tue Oct 11 21:40:56 1988 From: terry (Terry Sejnowski ) Date: Tue, 11 Oct 88 21:40:56 edt Subject: NIPS Early Registration is October 15 Message-ID: <8810120140.AA19961@crabcake.cs.jhu.edu> ***** Note: Deadline for early registration discounts is October 15 ***** Mail in registration form below: IEEE Conference on "NEURAL INFORMATION PROCESSING SYSTEMS - NATURAL AND SYNTHETIC" November 28 - December 1, 1988 (Mon-Thurs), Sheraton Denver Tech Center, Denver, Colorado with a Post Meeting Workshop, December 1-4 Keystone Resort, Colorado The program stresses interdisciplinary interactions. All papers have been thoroughly refereed. Plenary lectures will bridge the gap between engineering and biology. -------------------------------------------------------------------- REGISTRATION FORM: NAME: Last First Middle Initial Business or Academic Title Professional Affiliation Street Address and Internal Mail Code City State Zip Country (if not U.S.) Telephone FEES: (Registration includes Monday reception, Wednesday night banquet and 3 Continental breakfasts.) Conference: Early (before Oct. 15, 1988) $ 175 Late (after Oct. 15, 1988) $ 225 Early Full-time students, with I.D. $ 35 Late Full-time students $ 50 Registration includes the welcoming reception Monday night, the banquet Wednesday night, and Continental breakfast all three days. Registration for the post meeting workshop is separate. Post Meeting Workshop (Deadline Oct. 15): Post-meeting workshop $ 75 Post-meeting workshop, students $ 60 Enclosed is a check or money order in U.S. dollars for $________ (Please make check payable to the Neural Information Processing Conference) Financial support may be available for students (see previous page) Please mail form to: Dr. Clifford Lau ONR Code 1114SE 800 North Quincy Street BCT #1 Arlington, Virginia 22217 FINANCIAL SUPPORT: Modest financial support for travel may be available to students, young faculty, and some senior faculty changing fields to work on neural networks. Those requesting support should write a one page summary of their background, research interest, and include a curriculum vitae, and mail to the Chairman, Dr. Terry Sejnowski, Dept. of Biophysics, Johns Hopkins University, Baltimore, MD, 21218. Applicants will be notified of awards (typically $250-500) by November 1. ---------------------------------------------------------------------- PROGRAM HIGHLIGHTS More than 300 papers were submitted to the conference; each was refereed by multiple referees. A number of invited talks will survey active areas of research and lead the way to contributed oral presentations. The following is the currently planned program. 
Monday, November 28, 1988 8:00 PM: Wine and Cheese Reception, Denver Tech Center Tuesday November 29, 1988 SESSION O1: Learning and Generalization Invited Talk 8:30 O1.1: "Birdsong Learning", Mark Konishi, Division of Biology, California Institute of Technology Contributed Talks 9:10 O1.2: "Comparing Generalization by Humans and Adaptive Networks", M. Pavel, M.A. Gluck, V. Henkle, Department of Psychology, Stanford University 9:40 O1.3: "An Optimality Principle for Unsupervised Learning", T. Sanger, AI Lab, MIT 10:10 Break 10:30 O1.4: "Learning by Example with Hints", Y.S. Abu-Mostafa, California Institute of Technology, Department of Electrical Engineering 11:00 O1.5: "Associative Learning Via Inhibitory Search", D.H. Ackley, Cognitive Science Research Group, Bell Communications Research, Morristown NJ 11:30 O1.6: "Speedy Alternatives to Back Propagation", J. Moody, C. Darken, Computer Science Department, Yale University 12:00 Poster Session SESSION O2: Applications Tuesday Afternoon Invited Talk 2:20 O2.1: "Speech Recognition," John Bridle, Royal Radar Establishment, Malvern, U.K. Contributed Talks 3:00 O2.2: "Modularity in Neural Networks for Speech Recognition," A. Waibel, ATR Interpreting Telephony Research Laboratories, Osaka, Japan 3:30 O2.3: "Applications of Error Back-propagation to Phonetic Classification," H.C. Leung, V.W. Zue, Department of Electrical Eng. & Computer Science, MIT 4:00 O2.4: "Neural Network Recognizer for Hand-Written Zip Code Digits: Representations, Algorithms, and Hardware," J.S. Denker, H.P. Graf, L.D. Jackel, R.E. Howard, W. Hubbard, D. Henderson, W.R. Gardner, H.S. Baird, I. Guyon, AT&T Bell Laboratories, Holmdel, NJ 4:30 O2.5: "ALVINN: An Autonomous Land Vehicle in a Neural Network," D.A. Pomerleau, Computer Science Department, Carnegie Mellon University 5:00 O2.6: "A Combined Multiple Neural Network Learning System for the Classification of Mortgage Insurance Applications and Prediction of Loan Performance," S. Ghosh, E.A. Collins, C. L. Scofield, Nestor Inc., Providence, RI 8:00 PM Poster Session I Wednesday, November 30, 1988 AM SESSION O3: Neurobiology Invited Talk 8:30 O3.1: "Cricket Wind Detection," John Miller, Department of Zoology, UC Berkeley Contributed Talks 9:10 O3.2: "A Passive, Shared Element Analog Electronic Cochlea," D. Feld, J. Eisenberg, E.R. Lewis, Department of Electrical Eng. & Computer Science, University of California, Berkeley 9:40 O3.3: "Neuronal Maps for Sensory-motor Control in the Barn Owl," C.D. Spence, J.C. Pearson, J.J. Gelfand, R.M. Peterson, W.E. Sullivan, David Sarnoff Research Ctr, Subsidiary of SRI International, Princeton, NJ 10:10 Break 10:30 O3.4: "Simulating Cat Visual Cortex: Circuitry Underlying Orientation Selectivity," U.J. Wehmeier, D.C. Van Essen, C. Koch, Division of Biology, California Institute of Technology 11:00 O3.5: "Model of Ocular Dominance Column Formation: Analytical and Computational Results," K.D. Miller, J.B. Keller, M.P. Stryker, Department of Physiology, University of California, San Francisco 11:30 O3.6: "Modeling a Central Pattern Generator in Software and Hardware: Tritonia in Sea Moss," S. Ryckebusch, C. Mead, J. M.
Bower, Computational Neural Systems Program, Caltech 12:00 Poster Session Wednesday PM SESSION O4: Computational Structures Invited Talk 2:20 O4.1: "Symbol Processing in the Brain," Geoffrey Hinton, Computer Science Department, University of Toronto Contributed Talks 3:00 O4.2: "Towards a Fractal Basis for Artificial Intelligence," Jordan Pollack, New Mexico State University, Las Cruces, NM 3:30 O4.3: "Learning Sequential Structure In Simple Recurrent Networks," D. Servan-Schreiber, A. Cleeremans, J.L. McClelland, Department of Psychology, Carnegie-Mellon University 4:00 O4.4: "Short-Term Memory as a Metastable State. 'Neurolocator,' A Model of Attention," V.I. Kryukov, Research Computer Center, USSR Academy of Sciences 4:30 O4.5: "Heterogeneous Neural Networks for Adaptive Behavior in Dynamic Environments," R.D. Beer, H.J. Chiel, L.S. Sterling, Center for Automation and Intelligent Sys. Res., Case Western Reserve University, Cleveland, OH 5:00 O4.6: "A Link Between Markov Models and Multilayer Perceptrons," H. Bourlard, C.J. Wellekens, Philips Research Laboratory, Brussels, Belgium 7:00 PM Conference Banquet 9:00 Plenary Speaker "Neural Architecture and Function," Valentino Braitenberg, Max Planck Institut fur Biologische Kybernetik, West Germany Thursday, December 1, 1988 AM SESSION O5: Applications Invited Talk 8:30 O5.1: "Robotics, Modularity, and Learning," Rodney Brooks, AI Lab, MIT Contributed Talks 9:10 O5.2: "The Local Nonlinear Inhibition Circuit," S. Ryckebusch, J. Lazzaro, M. Mahowald, California Institute of Technology, Pasadena, CA 9:40 O5.3: "An Analog Self-Organizing Neural Network Chip," J. Mann, S. Gilbert, Lincoln Laboratory, MIT, Lexington, MA 10:10 Break 10:30 O5.4: "Performance of a Stochastic Learning Microchip," J. Alspector, B. Gupta, R.B. Allen, Bellcore, Morristown, NJ 11:00 O5.5: "A Fast, New Synaptic Matrix For Optically Programmed Neural Networks," C.D. Kornfeld, R.C. Frye, C.C. Wong, E.A. Rietman, AT&T Bell Laboratories, Murray Hill, NJ 11:30 O5.6: "Programmable Analog Pulse-Firing Neural Networks," Alan F. Murray, Lionel Tarassenko, Alister Hamilton, Department of Electrical Engineering, University of Edinburgh, Scotland, UK 12:00 Poster Session 3:00 PM Adjourn to Keystone for workshops CONFERENCE PROCEEDINGS: The collected papers of the conference, called "Advances in Neural Information Processing Systems", Volume 1, will be available starting April 1, 1989. To reserve a copy, contact Morgan Kaufmann Publishers, Inc., Order Fulfillment Department, P.O. Box 50490, Palo Alto, CA 94303, or call (415) 965-4081. HOTEL REGISTRATION: The meeting in Denver is at the Denver Sheraton Hotel. The Sheraton Denver Tech Center is a suburban hotel located in the southeast corridor of Colorado in the exclusive Denver Technological Center Business Park. The property is 18 miles from the Stapleton International Airport, 3 miles from Centennial Airport and easily accessible by taxi or local airport shuttle services. TRANSPORTATION: Air Travel to Denver: Stapleton Airport is one of the major hubs in the U.S. and is served by numerous carriers, including United Airlines which has direct flights or close connections from almost all of its cities to Denver. As the "Official Airline" of the conference, United has pledged to discount the fares for conference attendees to below that offered by any other carrier (and is also making free tickets available for a drawing at the end of the conference).
For reservations and further details, call 1-800-521-4041 and refer to account number 405IA. Ground Transport to Denver: (scheduled bus and van) - Shuttle service is available from the Southeast Airporter every 30 minutes at Door 6 on the baggage claim level, one way = $7 and round trip = $10. Car Rental: We have an agreement with Hertz Rental at the Sheraton for $20/day, 150 free miles/day with 30 cents/mile for each additional mile. Refer questions and reservations to Kevin Kline at Hertz (1-800-654-3131). POST MEETING WORKSHOP: December 1 - 4, 1988 Registration for the workshop is separate from the conference. It includes 2 continental breakfasts and one banquet dinner. FORMAT: Small group workshops will be held in the morning (7:30 - 9:30) and afternoon (4:30 - 6:30). The entire group will gather in the evening (8:30 - 10:30 p.m.) to hear the workshop leaders' reports, have further discussion and socialize. Last year this was a very successful format and we will again be open to suggestions from partricipants. Examples of the topics to be covered are Rules and Connectionist Modeling, Advances in Speech Recognition, New Experimental Methods, (especially optical recording with voltage sensitive dyes), Comparison of Hidden Markov vs. Neural Network Models, Complexity Issues, Neural Network vs. Traditional Classifiers, Real Neurons as Compared with Formal Neurons, Applications to Temporal Sequence Problems. Workshop Location: Keystone Resort will be the site of Neural Information Processing Systems workshops. Keystone Mountain is 70 miles west of Denver and offers the finest early season skiing in the Nation. Keystone Resort is a full service resort featuring world class skiing in addition to amenities including 11 swimming pools, indoor tennis, ice skating, 15 restaurants and a Village. Early December at Keystone provides an outstanding skiing value. IEEE has been able to secure to the following lodging rates. Accommodations: "Keystone Lodge": Single - 1 person $ 69.00 Double - 2 people $ 79.00 Condominiums Studio Condominium - 1 to 2 people $105.00 1 Bedroom Condominium - 1 to 4 people $130.00 These rates are per night and do not include 5.2% sales tax or 4.9% local surcharge. Please add $12.00 per person for persons over the stated levels. Attendance will be limited, and rooms not reserved by October 15th will be released back to the Keystone Resort. Keystone will extend discounted group lift tickets to IEEE workshop attendees for $17 per day. In addition, Keystone offers Night Skiing on over 200 acres of lighted terrain for $9 when extending your day ticket. Accommodations at Keystone may be reserved by calling 1-800-222-0188. When making your reservation, please refer to the group code DZ0IEEE to obtain the special conference rates. Transportation is available from the meeting site (the Sheraton Denver Tech Center) to Keystone and then from Keystone to Denver Stapleton International Airport, for $23 one way, or $46 round trip. Reservations can be made by completing the reservation form or by calling Keystone Resort at 1-800-451-5930. Hertz Rental cars are available at the Sheraton Denver Tech Center. Weekend rates are available. To reserve a Hertz rental car, call 1-800-654-3131. For those driving to Keystone, follow I-70 west to Exit 205, then take Highway 6 five miles west to Keystone. Word about Skiing at Summit County in December Keystone is joined by three other ski areas to compose Ski The Summit, Arapahoe Basic, Breckenridge, and Copper Mountain. 
All the areas are within 1/2 hour of each other and can be accessed by using the complimentary Summit Stage. Keystone features the largest snowmaking system in the West and traditionally has mid season conditions by December 1st. A Keystone lift ticket offers access to the three mountains; Keystone, the expert challenge of North Peak, and the Legend Arapahoe Basin. Keystone offers complete resort facilities including ski rental, ski school, nursery and mountain restaurants. Early December in Colorado is known as time for good skiing and minimal lift lines. From ajr at DSL.ENG.CAM.AC.UK Wed Oct 12 11:29:55 1988 From: ajr at DSL.ENG.CAM.AC.UK (Tony Robinson) Date: Wed, 12 Oct 88 11:29:55 BST Subject: Abstract Message-ID: <5917.8810121029@dsl.eng.cam.ac.uk> For those people who did not attend the nEuro'88 connectionists conference in Paris, our contribution is now available, abstract included below. Tony Robinson PS: Remember, reprint requests should be sent to "ajr at dsl.eng.cam.ac.uk" and NOT "connectionists at q.cs.cmu.edu" ============================================================================== A DYNAMIC CONNECTIONIST MODEL FOR PHONEME RECOGNITION A J Robinson, F Fallside Cambridge University Engineering Department Trumpington Street, Cambridge, England ajr at dsl.eng.cam.ac.uk ABSTRACT This paper describes the use of two forms of error propagation net trained to ascribe phoneme labels to successive frames of speech from multiple speakers. The first form places a fixed length window over the speech and labels the central portion of the window. The second form uses a dynamic structure in which the successive frames of speech and state vector containing context information are used to generate the output label. The paper concludes that the dynamic structure gives a higher recognition rate both in comparison with the fixed context structure and with the established k nearest neighbour technique. ============================================================================ From terry at cs.jhu.edu Wed Oct 12 09:48:39 1988 From: terry at cs.jhu.edu (Terry Sejnowski ) Date: Wed, 12 Oct 88 09:48:39 edt Subject: NIPS phone numbers Message-ID: <8810121348.AA23558@crabcake.cs.jhu.edu> Rooms should be reserved at the Denver Tech Center (800-552-7030) and at Keystone (800-222-0188). There are a limited number reserved for the meeting at a discount rate on a first-come first-serve basis. A tentative list of topics and leaders for the post- conference workshop at Keystone includes: chair topic H. Gigley Rules and Connectionist Modelling A. Waibel Speech, especially NN's vs HMM's A. Grinvald Imaging techniques in Neurobiology I. Parberry Computational Complexity Issues M. Carter Fault Tolerance in Neural Networks C. Lau Evaluation Criteria for Neural Network Demonstrations If you want to chair a session contact Scott Kirkpatrick, program chairman: KIRK at ibm.com Terry ----- From pratt at paul.rutgers.edu Wed Oct 12 12:16:40 1988 From: pratt at paul.rutgers.edu (Lorien Y. Pratt) Date: Wed, 12 Oct 88 12:16:40 EDT Subject: Eric Baum to speak on formal results for generalization in neural nets Message-ID: <8810121616.AA21774@paul.rutgers.edu> Fall, 1988 Neural Networks Colloquium Series at Rutgers What Size Net Gives Valid Generalization? ----------------------------------------- Eric B. Baum Jet Propulsion Laboratory California Institute of Technology Pasadena, CA. 
91109 Room 705 Hill center, Busch Campus Friday October 28, 1988 at 11:10 am Refreshments served before the talk Abstract We address the question of when a network can be expected to generalize from m random training examples chosen from some arbitrary probability distribution, assuming that future test examples are drawn from the same distribution. Among our results are the following bounds on appropriate sample vs. network size. Assume 0 < e <= 1/8. We show that if m >= O( WlogN/e log(1/e)) examples can be loaded on a feedforward network of linear threshold functions with N nodes and W weights, so that at least a fraction 1 - e/2 of the examples are correctly classified, then one has confidence approaching certainty that the network will correctly classify a fraction 1 - e of future test examples drawn from the same distribution. Conversely, for fully-connected feedforward nets with one hidden layer, any learning algorithm using fewer than Omega(W/e) random training examples will, for some distributions of examples consistent with an appropriate weight choice, fail at least some fixed fraction of the time to find a weight choice that will correctly classify more than a 1 - e fraction of the future test examples. From martha at cs.utexas.edu Thu Oct 13 12:45:36 1988 From: martha at cs.utexas.edu (Martha Weinberg) Date: Thu, 13 Oct 88 11:45:36 CDT Subject: Neural Computation Message-ID: <8810131645.AA22270@ratliff.cs.utexas.edu> From pratt at paul.rutgers.edu Thu Oct 13 15:23:39 1988 From: pratt at paul.rutgers.edu (Lorien Y. Pratt) Date: Thu, 13 Oct 88 15:23:39 EDT Subject: Cary Kornfeld to speak on hardware for neural nets & bitmapped graphics Message-ID: <8810131923.AA24651@zztop.rutgers.edu> Fall, 1988 Neural Networks Colloquium Series at Rutgers Bitmap Graphics and Neural Networks ----------------------------------- Cary Kornfeld AT&T Bell Laboratories Room 705 Hill center, Busch Campus Monday October 31, 1988 at 11:00 AM NOTE DAY AND TIME ARE DIFFERENT FROM USUAL Refreshments served before the talk From the perspective of system architecture and hardware design, bitmap graphics and neural networks are surprisingly alike. I will describe two key components of a graphics processor designed and fabricated at Xerox PARC; this engine is based on Leo Guibas's Bitmap Calculus. While implementing that machine I got interested in building tiny, experimental flat panel displays. In the second part of this talk, I will describe a few of the early prototypes and (if facilities permit) will show a short video clip of their operation. When I arrived at Bell Labs three years ago I began building larger display panels using amorphous silicon, thin film transistors on glass substrates. It was this display work that gave birth to the idea of fabricating large neural networks using light sensitive synaptic elements. In May of this year we demonstrated working prototypes of these arrays in an experimental neuro-computer at the Atlanta COMDEX show. This is one of the first neuro-computers built and is among the largest. Each of its 14,000 synapses is independently programmable over a continuous range of connection strength that can theoretically span more than five orders of magnitude (we've measured about three in our first-generation arrays). The computer has an animated, graphical user interface that enables the operator to both monitor and control its operation. This machine is "programmed" to solve a pattern reconstruction problem.
(Again, facilities permitting) I will show a video tape of its operation and will demonstrate the user interface on a color SUN 3. From alexis at marzipan.mitre.org Fri Oct 14 10:28:14 1988 From: alexis at marzipan.mitre.org (Alexis Wieland) Date: Fri, 14 Oct 88 10:28:14 EDT Subject: Hopfield Paper ?? Message-ID: <8810141428.AA01379@marzipan.mitre.org.> I am interested in a pointer to a Hopfield (and Tank?) paper which I thought existed and haven't been able to find. J.J. Hopfield has spoken about feed-forward nets described by his standard diff.eq.'s for accepting time-varying inputs. As I remember this system basically fed into a "Hopfield Net," but it's the time-varying feed-forward part that I am most interested in. My recollection is that there was an '86 paper on this, and I most recently heard about this at the AAAI Stanford workshop last spring. Any pointers to this or related work will be greatly appreciated. thanks in advance, Alexis Wieland wieland at mitre.arpa <= please use this address, just REPLYing may get confused From pollack at toto.cis.ohio-state.edu Fri Oct 14 15:12:37 1988 From: pollack at toto.cis.ohio-state.edu (Jordan B. Pollack) Date: Fri, 14 Oct 88 15:12:37 EDT Subject: tech report available In-Reply-To: Kathy Farrelly's message of 12 October 1988 1458-PDT (Wednesday) <8810122158.AA17267@sdics.ICS> Message-ID: <8810141912.AA01623@toto.cis.ohio-state.edu> Please and Thank you in advance, for a copy of Williams & Zipser's TR Jordan Pollack CIS Dept. OSU 2036 Neil Ave Columbus, OH 43210 From mcvax!bion.kth.se!orjan at uunet.UU.NET Thu Oct 13 04:48:35 1988 From: mcvax!bion.kth.se!orjan at uunet.UU.NET (Orjan Ekeberg) Date: Thu, 13 Oct 88 09:48:35 +0100 Subject: Paper from nEuro'88 in Paris Message-ID: <8810130848.AA23937@bogart.bion.kth.se> The following paper, presented at the nEuro'88 conference in Paris, has been sent for publication in the proceedings. Reprint requests can be sent to orjan at bion.kth.se (no cc to connectionists at ... please). =============== AUTOMATIC GENERATION OF INTERNAL REPRESENTATIONS IN A PROBABILISTIC ARTIFICIAL NEURAL NETWORK Orjan Ekeberg, Anders Lansner Department of Numerical Analysis and Computing Science The Royal Institute of Technology, S-100 44 Stockholm, Sweden ABSTRACT In a one layer feedback perceptron type network, the connections can be viewed as coding the pairwise correlations between activity in the corresponding units. This can then be used to make statistical inference by means of a relaxation technique based on Bayesian inference. When such a network fails, it might be because the regularities are not visible as pairwise correlations. One cure would then be to use a different internal coding where selected higher order correlations are explicitly represented. A method for generating this representation automatically is presented with a special focus on the network's ability to generalize properly. From pratt at paul.rutgers.edu Tue Oct 18 09:49:23 1988 From: pratt at paul.rutgers.edu (Lorien Y.
Pratt) Date: Tue, 18 Oct 88 09:49:23 EDT Subject: Seminar: A Connectionist Framework for visual recognition Message-ID: <8810181349.AA01080@zztop.rutgers.edu> I saw this posted locally, thought you might like to attend. Don't forget Josh Alspector's talk this Friday (10/21) on his Boltzmann chip! Rutgers University CAIP (Center for Computer Aids for Industrial Productivity) Seminar: A Connectionist Framework for Visual Recognition Ruud Bolle Exploratory Computer Vision Group IBM Thomas J. Watson Research Center Abstract This talk will focus on the organization and implementation of a vision system to recognize 3D objects. The visual world being modeled is assumed to consist of objects that can be represented by planar patches, patches of quadrics of revolution, and the intersection curves of those quadric surfaces. A significant portion of man-made objects can be represented using such primitives. One of the contributions of this vision system is that fundamentally different feature types, like surface and curve descriptions, are simultaneously extracted and combined to index into a database of objects. The input to the system is a depth map of a scene comprising one or more objects. From the depth map, surface parameters and surface intersection/object limb parameters are extracted. Parameter extraction is modeled as a set of layered and concurrent parameter space transforms. Any one transform computes only a partial geometric description that forms the input to a next transform. The final transform is a mapping into an object database, which can be viewed as the highest level of confidence for geometric descriptions and 3D objects within the parameter spaces. The approach is motivated by connectionist models of visual recognition systems. Date: Friday, November 4, 1988 Time: 3:00 PM Place: Conference room 139, CAIP center, Busch Campus For information, call (201) 932-3443 From pratt at paul.rutgers.edu Tue Oct 18 09:59:36 1988 From: pratt at paul.rutgers.edu (Lorien Y. Pratt) Date: Tue, 18 Oct 88 09:59:36 EDT Subject: Test message, please ignore. Message-ID: <8810181359.AA01171@zztop.rutgers.edu> From terry at cs.jhu.edu Tue Oct 18 18:00:21 1988 From: terry at cs.jhu.edu (Terry Sejnowski ) Date: Tue, 18 Oct 88 18:00:21 edt Subject: NIPS Student Travel Awards Message-ID: <8810182200.AA09780@crabcake.cs.jhu.edu> We have around 80 student applications for travel awards for the NIPS meeting in November. All students who are presenting an oral paper or poster will receive $250-500 depending on their expenses. Other students that have applied will very likely receive at least $250 --- but this depends on what the registration looks like at the end of the month. Official letters will go out on November 1. The deadline for 30 day supersaver fares is coming up soon. There is a $200+ savings for staying over Saturday night, so students who want to go to the workshop can actually save travel money by doing so. Terry ----- From INAM%MCGILLB.BITNET at VMA.CC.CMU.EDU Wed Oct 19 22:18:00 1988 From: INAM%MCGILLB.BITNET at VMA.CC.CMU.EDU (INAM000) Date: WED 19 OCT 1988 21:18:00 EST Subject: Outlets for theoretical work Message-ID: Department of Psychology, McGill University, 1205 Avenue Dr. Penfield, Montreal, Quebec, CANADA H3Y 2L2 October 20, 1988 Dear Connectionists, Given the recent resurgence of formal analysis of "neural networks" (e.g.
White,Gallant,Geman,Hanson,Burr),and the difficulty some people seem to have in finding an appropriate outlet for this work,I would like to remind researchers of the existence of the Journal of Mathematical Psychology.This is an Academic Press Journal that has been in existence for over 20 years,and is quite open to all kinds of mathematical papers in "theoretical" (i.e. mathematical, logical,computational) "psychology" (broadly interpreted). If you want further details regarding the Journal,or feedback about the appropriateness of a particular article,you can contact me by E-mail or telephone (514-398-6128),or contact the Editor directly-J.Townsend,Department of Psychology,Pierce Hall,Rm. 365A,West Lafayette,IND 47907. (E-Mail:KC at BRAZIL.PSYCH.PURDUE.EDU; Telephone: 317-494-6236).The address for manuscript submission is: Journal of Mathematical Psychology, Editorial Office,Seventh Floor, 1250 Sixth Avenue, San Diego, CA 92101. Regards Tony A.A.J.Marley Professor Book Review Editor,Journal of Mathematical Psychology E-MAIL - INAM at MUSICB.MCGILL.CA From MJ_CARTE%UNHH.BITNET at VMA.CC.CMU.EDU Mon Oct 24 11:13:00 1988 From: MJ_CARTE%UNHH.BITNET at VMA.CC.CMU.EDU (MJ_CARTE%UNHH.BITNET@VMA.CC.CMU.EDU) Date: Mon, 24 Oct 88 11:13 EDT Subject: Room mate sought for NIPS/Post-Mtg. Workshop Message-ID: I'm a new faculty member on a tight travel budget, and I'm looking for a room mate to split lodging expenses with at both the NIPS conference and at the post-meeting workshop (need not be the same person at each location). Please reply via e-mail to or call me at (603) 862-1357. Mike Carter Intelligent Structures Group University of New Hampshire From gluck at psych.Stanford.EDU Mon Oct 24 13:07:57 1988 From: gluck at psych.Stanford.EDU (Mark Gluck) Date: Mon, 24 Oct 88 10:07:57 PDT Subject: Reprints avail. Message-ID: Reprints of the following two papers are available by netrequest to gluck at psych.stanford.edu or by writing: Mark Gluck, Dept. of Psychology, Jordan Hall; Bldg. 420, Stanford Univ., Stanford, CA 94305. Gluck, M. A., & Bower, G. H. (1988) From conditioning to category learning: An adaptive network model. Journal of Experimental Psychology: General, V. 117, N. 3, 227-247 Abstract -------- We used adaptive network theory to extend the Rescorla-Wagner (1972) least mean squares (LMS) model of associative learning to phenomena of human learning and judgment. In three experiments subjects learned to categorize hypothetical patients with particular symptom patterns as having certain diseases. When one disease is far more likely than another, the model predicts that subjects will sub- stantially overestimate the diagnosticity of the more valid symptom for the rare disease. The results of Experiments 1 and 2 provide clear support for this prediction in contradistinction to predictions from probability matching, exemplar retrieval, or simple prototype learning models. Experiment 3 contrasted the adaptive network model with one predicting pattern-probability matching when patients always had four symptoms (chosen from four opponent pairs) rather than the presence or absence of each of four symptoms, as in Experiment 1. The results again support the Rescorla-Wagner LMS learning rule as embedded within an adaptive network. Gluck, M. A., Parker, D. B., & Reifsnider, E. (1988) Some biological implications of a differential-Hebbian learning rule. Psychobiology, Vol. 
16(3), 298-302 Abstract -------- Klopf (1988) presents a formal real-time model of classical conditioning which generates a wide range of behavioral Pavlovian phenomena. We describe a replication of his simulation results and summarize some of the strengths and shortcomings of the drive- reinforcement model as a real-time behavioral model of classical conditioning. To facilitate further comparison of Klopf's model with neuronal capabilities, we present a pulse-coded reformulation of the model that is more stable and easier to compute than the original, frequency-based model. We then review three ancillary assumptions to the model's learning algorithm, noting that each can be seen as dually motivated by both behavioral and biological considerations. From Scott.Fahlman at B.GP.CS.CMU.EDU Tue Oct 25 21:24:04 1988 From: Scott.Fahlman at B.GP.CS.CMU.EDU (Scott.Fahlman@B.GP.CS.CMU.EDU) Date: Tue, 25 Oct 88 21:24:04 EDT Subject: Political science viewed as a Neural Net :-) Message-ID: I was thinking about the upcoming U.S. election today, and it occurred to me that the seemingly useless electoral college mandated by the U.S. constitution might actually be of some value. A direct democratic election is basically a threshold decision function with lots of inputs and with fixed weights; add the electoral college and you've got a layered network with fifty hidden units, each with a non-linear threshold function. A direct election can only choose a winner based on some linearly separable function of voter opinions. You would expect to see complex issues forcibly projected onto some crude 1-D scale (e.g. "liberal" vs. "conservative" or "wimp" vs. "macho"). With a multi-layer decision network the system should be capable of performing a more complex separation of the feature space. Though they lacked the sophisticated mathematical theories available today, the designers of our constitution must have sensed the severe computational limitations of direct democracy and opted for the more complex decision system. Unfortunately, we do not seem to be getting the full benefit of this added flexibility. What the founding fathers left out of this multi-layer network is a mechanism for adjusting the weights in the network based on how well the decision ultimately turned out. Perhaps some form of back-propagation would work here. It might be hard to agree on a proper error measure, but the idea seems worth exploring. For example, everyone who voted for Nixon in 1972 should have the weight of his his future votes reduced by epsilon; a large momentum term would be added to the reduction for those people who had voted for Nixon previously. The reduction would be greater for voters in states where the decision was close (if any such states can be found). There is already a mechanism in place for altering the output weights of the hidden units: those states that correlate positively with the ultimate decision end up with more political "clout", then with more defense-related jobs. This leads to an influx of people and ultimately to more electoral votes for that state. Some sort of weight-decay term would be needed to prevent a runaway process in which all of the people end up in California. We might also want to add more cross-connections in the network. At present, each voter affects only one hidden unit, the state where he resides. This somewhat limits the flexibility of the learning process in assigning arbitrary functions to the hidden units. To fix this, we could allow voters to register in more than one state. 
George Bush has five or six home states; why not make this option available to all voters? More theoretical analysis of this complex system is needed. Perhaps NSF should fund a center for this kind of thinking. The picture is clouded by the observation that individual voters are not simple predicates: most of them have a rudimentary capacity for simple inference and in some cases they even exhibit a form of short-term learning. However, these minor perturbations probably cancel out on the average, and can be treated as noise in the decision units. Perhaps the amount of noise can be manipulated to give a crude approximation to simulated annealing. -- Scott From pratt at paul.rutgers.edu Wed Oct 26 09:15:09 1988 From: pratt at paul.rutgers.edu (Lorien Y. Pratt) Date: Wed, 26 Oct 88 09:15:09 EDT Subject: Neural networks, intelligent machines, and the AI wall: Jack Gelfand Message-ID: <8810261315.AA10043@paul.rutgers.edu> This looks like it'll be an especially interesting and controversial talk for our department. I hope you all can make it! --Lori Fall, 1988 Neural Networks Colloquium Series at Rutgers NEURAL NETWORKS, INTELLIGENT MACHINES AND THE AI WALL Jack Gelfand The David Sarnoff Research Center SRI International Princeton,N.J. Room 705 Hill center, Busch Campus Friday November 4, 1988 at 11:10 am Refreshments served before the talk When we look back at the last 25 years of AI research, we find that there have been many new techniques which have promised to produce intelligent machines for real world applications. Though the performance of some of these machines is quite extraordinary, very few have approached the performance of human beings for even the most rudimentary tasks. We believe that this is due to the fact that these methods have been largely monolithic, whereas biological systems approach these problems by combining many different modes of processing into integrated systems. A number of real and artificial neural network systems will be discussed in terms of how knowledge is represented, combined and processed in order to solve complex problems. From pratt at paul.rutgers.edu Wed Oct 26 12:27:42 1988 From: pratt at paul.rutgers.edu (Lorien Y. Pratt) Date: Wed, 26 Oct 88 12:27:42 EDT Subject: Looking for references to work on connectionism & databases Message-ID: <8810261627.AA01025@zztop.rutgers.edu> I am interested in work which might relate neural networks to databases in any way. Please send me any relevant references. Thanks, Lori From JRICE at cs.tcd.ie Thu Oct 27 06:23:00 1988 From: JRICE at cs.tcd.ie (JRICE@cs.tcd.ie) Date: Thu, 27 Oct 88 10:23 GMT Subject: Looking for references on connectionism and Protein Structure Message-ID: I am interested in applying connectionism to the generation of protein secondary and tertiary structure from the primary amino acid sequence. Could you please send me any relevent references. Thanks, John. From terry Thu Oct 27 09:19:18 1988 From: terry (Terry Sejnowski ) Date: Thu, 27 Oct 88 09:19:18 edt Subject: Looking for references on connectionism and Protein Structure Message-ID: <8810271319.AA05074@crabcake.cs.jhu.edu> The best existing method for predicting the secondary structure of a globular protein is a neural network: Qian, N. and Sejnowski, T. J. (1988) Predicting the secondary structure of globular proteins using neural network models. Journal of Molecular Biology 202, 865-884. Our results also indicate that only marginal improvements on our performance will be possible with local methods. 
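As an aside for readers unfamiliar with how such local methods are set up, the sketch below shows the sliding-window encoding typically used in this style of secondary-structure prediction. It is not the Qian & Sejnowski code; the window size, the one-hot residue coding, the three output classes, and the untrained stand-in weights are all assumptions made for illustration.

# Minimal sketch of a sliding-window secondary-structure predictor.
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"           # 20 residue types
CLASSES = ["helix", "sheet", "coil"]
WINDOW = 13                                    # residues per input window (assumed)

def encode_window(sequence, center, window=WINDOW):
    """One-hot encode the residues in a window centred on `center`.
    Positions falling off either end of the chain are left as all zeros."""
    half = window // 2
    x = np.zeros((window, len(AMINO_ACIDS)))
    for k in range(window):
        pos = center - half + k
        if 0 <= pos < len(sequence) and sequence[pos] in AMINO_ACIDS:
            x[k, AMINO_ACIDS.index(sequence[pos])] = 1.0
    return x.ravel()                           # 13 * 20 = 260 inputs

# A feedforward net maps each 260-dim window to 3 class scores; here a single
# random weight layer stands in for a trained network.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(len(CLASSES), WINDOW * len(AMINO_ACIDS)))

seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"       # toy sequence
scores = W @ encode_window(seq, center=10)
print(CLASSES[int(np.argmax(scores))])         # predicted class for residue 10
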
Tertiary (3-D) structure is a much more difficult problem for which there are no good methods. Terry ----- From steeg at ai.toronto.edu Thu Oct 27 11:13:54 1988 From: steeg at ai.toronto.edu (Evan W. Steeg) Date: Thu, 27 Oct 88 11:13:54 EDT Subject: Looking for references on connectionism and Protein Structure Message-ID: <88Oct27.111408edt.7119@neat.ai.toronto.edu> In addition to the Qian & Sejnowski work, a potent illustration of the capabilities of (even today's rather simplistic) neural nets, there is: Bohr, N., Bohr, J., Brunak, S., Cotterill, R.M.J., Lautrup, B., Norskov, L., Olsen, O.H., and Petersen, S.B. (of the Bohr Institute and Tech. Univ. of Denmark), "Revealing Protein Structure by Neural Networks", presented as a poster at the Fourth International Symposium on Biological and Artificial Intelligence Systems, Trento, Italy 1988. Presumably, a paper will be published shortly. They use a feed-forward net and back-propagation, like Qian and Sejnowski, but use separate nets for each of 3 kinds of protein secondary structure, rather than a single net, and there are other methodological differences as well. Others, myself included, are using neural net techniques which exploit global (from the whole molecule), in addition to local interactions. This should, as Dr. Sejnowski pointed out, lead to more accurate structure prediction. Results of this work will begin to appear within a couple of months. -- Evan Evan W. Steeg (416) 978-7321 steeg at ai.toronto.edu (CSnet,UUCP,Bitnet) Dept of Computer Science steeg at ai.utoronto (other Bitnet) University of Toronto, steeg at ai.toronto.cdn (EAN X.400) Toronto, Canada M5S 1A4 {seismo,watmath}!ai.toronto.edu!steeg From hinton at ai.toronto.edu Thu Oct 27 15:39:44 1988 From: hinton at ai.toronto.edu (Geoffrey Hinton) Date: Thu, 27 Oct 88 15:39:44 EDT Subject: How to ensure that back-propagation separates separable patterns Message-ID: <88Oct27.130054edt.6408@neat.ai.toronto.edu> We have recently obtained the following results, and would like to know if anyone has seen them previously written up. The description is informal but accurate: There has been some interest in understanding the behavior of backpropagation in feedforward nets with no hidden neurons. This is a particularly simple case that does not require the full power of back-propagation (it can also be approached from the point of view of perceptrons), but we believe that it provides useful insights into the structure of the error surface, and that understanding this case is a prerequisite for understanding the more general case with hidden layers. In [1], the authors give examples illustrating the fact that while a training set may be separable, a net performing backpropagation (gradient) search may get stuck in a solution which fails to separate. The objective of this note is to point out that if one uses instead a threshold procedure, where we do not penalize values ``beyond'' the targets, then such counterexamples cease to exist, and in fact that one has a convergence theorem that closely parallels that for perceptrons: the continuous gradient adjustment procedure is such that, from any initial weight configuration, a separating set of weights is obtained in finite time. We also show how to modify the example given in [2] to conclude that there is a training set consisting of 125 binary vectors and a network configuration for which there are nonglobal local minima, even if threshold LMS is used. In this example, the training set is of course not separable. 
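Before the comparisons that follow, here is a minimal sketch of the threshold procedure described above for the no-hidden-unit case: outputs already beyond the target contribute no error and hence no gradient. The tanh unit, the +/-0.5 targets, the AND-style toy data, and the fixed step size are assumptions made for illustration; the result itself concerns the continuous-time gradient flow.

# Minimal sketch of thresholded LMS for a single-layer net (no hidden units).
import numpy as np

def thresholded_gradient(w, X, t):
    """Gradient of sum_i err_i^2 for y = tanh(X @ w), where err_i is zeroed
    whenever the output is beyond its target in the correct direction."""
    y = np.tanh(X @ w)
    err = y - t
    err[np.sign(err) == np.sign(t)] = 0.0      # do not penalize values beyond the target
    dy = 1.0 - y**2                            # tanh derivative
    return 2.0 * X.T @ (err * dy)

# Toy separable (AND-like) problem; the constant third input acts as a bias.
X = np.array([[1., 1., 1.], [1., 0., 1.], [0., 1., 1.], [0., 0., 1.]])
t = np.array([0.5, -0.5, -0.5, -0.5])
w = np.zeros(3)
for _ in range(2000):
    w -= 0.1 * thresholded_gradient(w, X, t)
print(np.sign(np.tanh(X @ w)) == np.sign(t))   # all True once a separating w is found
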
Finally, we compare our results to more classical pattern recognition theorems, in particular to the convergence of the relaxation procedure for threshold LMS with linear output units ([3]), and show that the continuous adjustment approach is essential in the study of the nonlinear case. Another essential difference with the linear case is that in the latter nonglobal local minima cannot happen even if the data is not separable. References: [1] Brady, M., R. Raghavan and J. Slawny, ``Backpropagation fails to separate where perceptrons succeed,'' submitted for publication. Summarized version in ``Gradient descent fails to separate,'' in {\it Proc. IEEE International Conference on Neural Networks}, San Diego, California, July 1988, Vol. I, pp.649-656. [2] Sontag, E.D., and H.J. Sussmann, ``Backpropagation can give rise to spurious local minima even for networks without hidden layers,'' submitted. [3] Duda, R.O., and P.E. Hart, {\it Pattern Classificationa and Scene Analysis}, Wiley, New York, 1973. Geoff Hinton Eduardo Sontag Hector Sussman From rpl at ll-sst.arpa Fri Oct 28 09:42:23 1988 From: rpl at ll-sst.arpa (Richard Lippmann) Date: Fri, 28 Oct 88 09:42:23 EDT Subject: reply on back-propagation fails to separate paper Message-ID: <8810281342.AA03049@ll-sst.arpa> Geoff, We came up with the same conclusion a while ago when some people were worried about the performance of back propagation but never published it. Back propagation with limits seems to converge correctly for those contrived deterministic cases where minimizing total squared error does not minimize the percent patterns classified correctly. The limits cause the algorithm to change from an LMS mean-square minimizing approach to perceptron-like error corrective approach. Typically, however, the difference in percent patterns classified correctly between the local and global solutions in those cases tends to be small. In practice, we found that convergence for the one contrived case we tested with limits took rather a long time. I have never seen this published and it would be good to see your result published with a convergence proof. I have also seen little published on the effect of limits on performance of classifiers or on final weight values. Rich From Mark.Derthick at MCC.COM Fri Oct 28 14:27:00 1988 From: Mark.Derthick at MCC.COM (Mark.Derthick@MCC.COM) Date: Fri, 28 Oct 88 13:27 CDT Subject: TR available Message-ID: <19881028182725.0.DERTHICK@THORN.ACA.MCC.COM> For copies of my thesis, ask copetas at cs.cmu.edu for CMU-CS-88-182 "Mundane Reasoning by Parallel Constraint Satisfaction." I am 1200 miles away from the reports, so asking me doesn't do you any good: Mark Derthick MCC 3500 West Balcones Center Drive Austin, TX 78759 (512)338-3724 Derthick at MCC.COM If you have previously asked me for this report, it should be arriving soon. There aren't many extra copies right now, so requests to copetas may be delayed for a while. ABSTRACT Connectionist networks are well suited to everyday common sense reasoning. Their ability to simultaneously satisfy multiple soft constraints allows them to select from conflicting information in finding a plausible interpretation of a situation. However these networks are poor at reasoning using the standard semantics of classical logic, based on truth in all possible models. 
This thesis shows that using an alternate semantics, based on truth in a single most plausible model, there is an elegant mapping from theories expressed using the syntax of propositional logic onto connectionist networks. An extension of this mapping to allow for limited use of quantifiers suffices to build a network from knowledge bases expressed in a frame language similar to KL-ONE. Although finding optimal models of these theories is intractable, the networks admit a fast hill climbing search algorithm that can be tuned to give satisfactory answers in familiar situations. The Role Shift problem illustrates the potential of this approach to harmonize conflicting information, using structured distributed representations. Although this example works well, much remains to be done before realistic domains are feasible. From mehra at aquinas.csl.uiuc.edu Fri Oct 28 12:44:02 1988 From: mehra at aquinas.csl.uiuc.edu (Pankaj Mehra) Date: Fri, 28 Oct 88 11:44:02 CDT Subject: reply on back-propagation fails to separate paper Message-ID: <8810281644.AA06760@aquinas> Hi everybody. When I heard Brady et al.'s talk at ICNN-88, I thought that the results simply pointed out that a correct approach to classification may not give equal importance to all training samples. As is well-known, classical back-prop converges to a separating surface that depends on the LMS error summed uniformly over all training samples. I think that the new results provide a case for attaching more importance to the elements on concept boundaries. I have been working on this problem (of trying to characterize "boundary" elements) off and on, without much success. Basically, geometric characterizations exist but they are too complex to evaluate. What is interesting, however, is the fact that the complexity of learning (hence, the time for convergence) depends on the nature of the separating surface. Theoretical results also involve similar concepts, e.g. VC-dimension. Also notice that if one could somehow "learn" the characteristic of boundary elements, then one could ignore a large part of the training sample and still converge properly using a threshold procedure like that suggested in Geoff's note. Lastly, since back-prop is not constrained to always use LMS as the error function, one wonders if there is an intelligent method (that can be automated) for constructing error functions based on the complexity of the separating surface. - Pankaj Mehra {mehra%aquinas at uxc.cso.uiuc.edu} From pratt at paul.rutgers.edu Fri Oct 28 16:54:11 1988 From: pratt at paul.rutgers.edu (Lorien Y. Pratt) Date: Fri, 28 Oct 88 16:54:11 EDT Subject: Production rule LHS pattern matching with neural nets Message-ID: <8810282054.AA02887@zztop.rutgers.edu> Hello, I'm interested in any work involving matching the left-hand side of a production rule using a neural network. I imagine that one could use a representation language other than first-order logic for the LHS, one which might be more amenable to a neural network approach. Thanks (again!) for any pointers, Lori ------------------------------------------------------------------- Lorien Y.
Pratt Computer Science Department pratt at paul.rutgers.edu Rutgers University Busch Campus (201) 932-4634 Piscataway, NJ 08854 From mkon at bu-cs.BU.EDU Fri Oct 28 19:35:30 1988 From: mkon at bu-cs.BU.EDU (mkon@bu-cs.BU.EDU) Date: Fri, 28 Oct 88 19:35:30 EDT Subject: reply on back-propagation fails to separate paper In-Reply-To: Pankaj Mehra's message of Fri, 28 Oct 88 11:44:02 CDT <8810281644.AA06760@aquinas> Message-ID: <8810282335.AA24173@bucsd.bu.edu> I would appreciate a preprint or any related information (say, other references) related to the discussion you presented on the connectionist network today. Thanks in advance. Mark A. Kon Department of Mathematics Boston University Boston, MA 02215 mkon at bu-cs.bu.edu From sontag at fermat.rutgers.edu Sat Oct 29 10:50:52 1988 From: sontag at fermat.rutgers.edu (Eduardo Sontag) Date: Sat, 29 Oct 88 10:50:52 EDT Subject: reply on back-propagation fails to separate paper In-Reply-To: <8810282335.AA24173@bucsd.bu.edu> (mkon@bu-cs.bu.edu) Message-ID: <8810291450.AA21199@control.rutgers.edu> Mark, Re your question to Mehra about complexity of boundaries and VC dimension, we just had a talk at Rutgers yesterday by Eric Baum (baum at pupgg.princeton.edu) about this. You should ask him for copies of his papers on the subject, which also contain references to Valiant's and other related work. I think that the relation between Brady et.al. and our results can be explained better in terms of threshold vs nonthreshold costs than in terms of relative weightings of terms. -eduardo Eduardo D. Sontag Rutgers Center for Systems and Control (SYCON) Rutgers University (sontag at fermat.rutgers.edu) From ajr at DSL.ENG.CAM.AC.UK Mon Oct 31 06:14:50 1988 From: ajr at DSL.ENG.CAM.AC.UK (Tony Robinson) Date: Mon, 31 Oct 88 11:14:50 GMT Subject: Tech report available Message-ID: <2402.8810311114@dsl.eng.cam.ac.uk> Here is the summary of a tech report which demonstates that the error propagation algorithm is not limited to weighted-sum type nodes, but can be used to train radial-basis-function type nodes and others. Send me some email if you would like a copy. Tony. P.S. If you asked for a copy of my/our last paper, I've taken the liberty of sending you a hard copy of this one as well. Thank you for replying to ajr at dsl.eng.cam.ac.uk not connectionists at ... `'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`' Generalising the Nodes of the Error Propagation Network CUED/F-INFENG/TR.25 A J Robinson, M Niranjan, F Fallside Cambridge University Engineering Department Trumpington Street, Cambridge, England email: ajr at uk.ac.cam.eng.dsl 1 November 1988 Gradient descent has been used with much success to train connectionist models in the form of the Error Propagation Network (Rumelhart Hinton and Williams, 1986). In these nets the output of a node is a non-linear function of the weighted sum of the activations of other nodes. This type of node defines a hyper-plane in the input space, but other types of nodes are possible. For example, the Kanerva Model (Kanerva 1984), the Modified Kanerva Model (Prager and Fallside 1988), networks of Spherical Graded Units (Hanson and Burr, 1987), networks of Localised Receptive Fields (Moody and Darken, 1988) and the method of Radial Basis Functions (Powell, 1985; Broomhead and Lowe 1988) all use nodes which define volumes in the input space. 
Niranjan and Fallside (1988) summarise these and compare the class boundaries formed by this family of networks with feed-forward networks and nearest neighbour classifiers. This report shows that the error propagation algorithm can be used to train general types of node. The example of a gaussian node is given and this is compared with other connectionist models for the problem of recognition of steady state vowels from multiple speakers. From pratt at paul.rutgers.edu Mon Oct 31 12:57:09 1988 From: pratt at paul.rutgers.edu (Lorien Y. Pratt) Date: Mon, 31 Oct 88 12:57:09 EST Subject: Paul Thagard to speak on Analogical thinking Message-ID: <8810311757.AA04514@zztop.rutgers.edu> COGNITIVE PSYCHOLOGY FALL COLLOQUIUM SERIES (Rutgers University) Date: 9 November 1988 Time: 4:30 PM Place: Room 307, Psychology Building, Busch Campus Paul Thagard, Cognitive Science Program, Princeton University ANALOGICAL THINKING Analogy is currently a very active area of research in both cognitive psychology and artificial intelligence. Keith Holyoak and I have developed connectionist models of analogical retrieval and mapping that are consistent with the results of psychological experiments. The new models use localist networks to simultaneously satisfy a set of semantic, structural, and pragmatic constraints. After providing a general view of analogical thinking, this talk will describe our model of analog retrieval. From pratt at paul.rutgers.edu Mon Oct 31 13:13:18 1988 From: pratt at paul.rutgers.edu (Lorien Y. Pratt) Date: Mon, 31 Oct 88 13:13:18 EST Subject: Schedule of remaining talks this semester Message-ID: <8810311813.AA04698@zztop.rutgers.edu> Speaker schedule as of 10/31/88 for end of the semester talks in the Fall, 1988 Neural Networks Colloquium Series at Rutgers. Speaker Date Title ------- ---- ----- Jack Gelfand 11/4/88 Neural networks, Intelligent Machines, and the AI wall Mark Jones 11/11/88 Knowledge representation in connectionist networks, including inheritance reasoning and default logic. E. Tzanakou 11/18/88 ALOPEX: Another optimization method Stefan Shrier 12/9/88 Abduction Machines for Grammar Discovery From cpd at CS.UCLA.EDU Mon Oct 31 16:28:05 1988 From: cpd at CS.UCLA.EDU (Charles Dolan) Date: Mon, 31 Oct 88 13:28:05 PST Subject: Tech report on connectionist knowledge processing Message-ID: <881031.212805z.29548.cpd@oahu.cs.ucla.edu> Implementing a connectionist production system using tensor products September, 1988 UCLA-AI-88-15 CU-CS-411-88 Charles P. Dolan Paul Smolensky AI Center Department of Computer Science & Hughes Research Labs Institute of Cognitive Science 3011 Malibu Canyon Rd. University of Colorado Malibu, CA 90265 Boulder, CO 80309-0430 & UCLA AI Laboratory Abstract In this paper we show that the tensor product technique for constructing variable bindings and for representing symbolic structure-used by Dolan and Dyer (1987) in parts of a connectionist story understanding model, and analyzed in general terms in Smolensky (1987)-can be effectively used to build a simplified version of Touretzky & Hinton's (1988) Distributed Connectionist Production System. The new system is called the Tensor Product Product System (TPPS). Copyright c 1988 by Charles Dolan & Paul Smolensky. For copies send a message to valerie at cs.ucla.edu at UCLA or kate at boulder.colorado.edu Boulder From VIJAYKUMAR at cs.umass.EDU Mon Oct 31 14:57:00 1988 From: VIJAYKUMAR at cs.umass.EDU (Vijaykumar Gullapalli 545-1596) Date: Mon, 31 Oct 88 15:57 EDT Subject: Tech. 
Report available Message-ID: <8810312059.AA10948@crash.cs.umass.edu> The following Tech. Report is available. Requests should be sent to "SMITH at cs.umass.edu". A Stochastic Algorithm for Learning Real-valued Functions via Reinforcement Feedback Vijaykumar Gullapalli COINS Technical Report 88-91 University of Massachusetts Amherst, MA 01003 ABSTRACT Reinforcement learning is the process by which the probability of the response of a system to a stimulus increases with reward and decreases with punishment. Most of the research in reinforcement learning (with the exception of the work in function optimization) has been on problems with discrete action spaces, in which the learning system chooses one of a finite number of possible actions. However, many control problems require the application of continuous control signals. In this paper, we present a stochastic reinforcement learning algorithm for learning functions with continuous outputs. Our algorithm is designed to be implemented as a unit in a connectionist network. We assume that the learning system computes its real-valued output as some function of a random activation generated using the Normal distribution. The activation at any time depends on the two parameters, the mean and the standard deviation, used in the Normal distribution, which, in turn, depend on the current inputs to the unit. Learning takes place by using our algorithm to adjust these two parameters so as to increase the probability of producing the optimal real value for each input pattern. The performance of the algorithm is studied by using it to learn tasks of varying levels of difficulty. Further, as an example of a potential application, we present a network incorporating these real-valued units that learns the inverse kinematic transform of a simulated 3 degree-of-freedom robot arm. From terry at cs.jhu.edu Mon Oct 31 19:35:09 1988 From: terry at cs.jhu.edu (Terry Sejnowski ) Date: Mon, 31 Oct 88 19:35:09 est Subject: reply on back-propagation fails to separate paper Message-ID: <8811010035.AA17520@crabcake.cs.jhu.edu> The proceedings of the CMU Connectionist Models Summer School has two papers on optimal choice of training set based on "critical" or "boundary" patterns: Karen Huyser on boolean functions and Ahmad Subatai on the majority function. The proceedings are available from Morgan Kaufmann. Terry ----- From neural!jsd at ihnp4.att.com Sun Oct 30 00:08:28 1988 From: neural!jsd at ihnp4.att.com (neural!jsd@ihnp4.att.com) Date: Sun, 30 Oct 88 00:08:28 EDT Subject: We noticed LMS fails to separate Message-ID: <8810300407.AA13067@neural.UUCP> Yes, we noticed that a Least-Mean-Squares (LMS) network even with no hidden units fails to separate some problems. Ben Wittner spoke at the IEEE NIPS meeting in Denver, November 1987, describing TWO failings of this type. He gave an example of a situation in which LMS algorithms (including those commonly referred to as back-prop) are metastable, i.e. they fail to separate the data for certain initial configurations of the weights. He went on to describe another case in which the algorithm actually leaves the solution region after starting within it. He also pointed out that this can lead to learning sessions in which the categorization performance of back-prop nets (with or without hidden units) is not a monotonically improving function of learning time. Finally, he presented a couple of ways of modifying the algorithm to get around these problems, and proved a convergence theorem for the modified algorithms. 
One of the key ideas is something that has been mentioned in several recent postings, namely, to have zero penalty when the training pattern is well-classified or "beyond". We cited Minsky & Papert as well as Duda & Hart; we believe they were more-or-less aware of these bugs in LMS, although they never presented explicit examples of the failure modes. Here is the abstract of our paper in the proceedings, _Neural Information Processing Systems -- Natural and Synthetic_, Denver, Colorado, November 8-12, 1987, Dana Anderson Ed., AIP Press. We posted the abstract back in January '88, but apparently it didn't get through to everybody. Reprints of the whole paper are available. Strategies for Teaching Layered Networks Classification Tasks Ben S. Wittner (1) John S. Denker AT&T Bell Laboratories Holmdel, New Jersey 07733 ABSTRACT: There is a widespread misconception that the delta-rule is in some sense guaranteed to work on networks without hidden units. As previous authors have mentioned, there is no such guarantee for classification tasks. We will begin by presenting explicit counter-examples illustrating two different interesting ways in which the delta rule can fail. We go on to provide conditions which do guarantee that gradient descent will successfully train networks without hidden units to perform two-category classification tasks. We discuss the generalization of our ideas to networks with hidden units and to multi-category classification tasks. (1) Currently at NYNEX Science and Technology / 500 Westchester Ave. White Plains, NY 10604
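[Editorial note] The "zero penalty beyond the target" idea that runs through this thread (the Hinton/Sontag/Sussmann note, Rich Lippmann's reply, and the Wittner & Denker abstract above) can be made concrete with a small sketch. The code below is not taken from any of those papers; it is a minimal illustration, assuming a single linear output unit, +1/-1 targets, and the hypothetical function names shown, of how a thresholded LMS update differs from plain LMS: a pattern whose output is already at or past its target contributes no weight change.

import numpy as np

def train_thresholded_lms(X, targets, lr=0.05, epochs=500, seed=0):
    # Single unit, no hidden neurons.  X is (n_patterns, n_inputs),
    # targets is a vector of +1/-1 labels.  Plain LMS minimizes
    # sum_i (t_i - y_i)^2 over every pattern and can settle on
    # non-separating weights even for separable data; the thresholded
    # variant below applies zero penalty to any pattern whose output is
    # already at or "beyond" its target, so only patterns that fall
    # short of the target move the weights.
    rng = np.random.default_rng(seed)
    Xb = np.hstack([X, np.ones((len(X), 1))])      # append a bias input
    w = rng.normal(scale=0.1, size=Xb.shape[1])
    for _ in range(epochs):
        for x, t in zip(Xb, targets):
            y = w @ x                              # linear output unit
            if t * y < 1.0:                        # not yet "beyond" the target
                w += lr * (t - y) * x              # usual LMS/delta update
    return w

def classify(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.sign(Xb @ w)

With a sigmoid or tanh output unit the same effect is usually obtained by placing the targets strictly inside the output range (the "limits" mentioned in Lippmann's reply), so that an output can actually get past them.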
From prlb2!welleken at uunet.UU.NET Sat Oct 8 13:00:24 1988 From: prlb2!welleken at uunet.UU.NET (Wellekens) Date: Sat, 8 Oct 88 18:00:24 +0100 Subject: report available Message-ID: <8810081700.AA20933@prlb2.UUCP> The following report is available free of charge from Chris.J.Wellekens, Philips Research Laboratory Brussels, 2 Avenue van Becelaere, B-1170 Brussels, Belgium. Email wlk at prlb2.uucp LINKS BETWEEN MARKOV MODELS AND MULTILAYER PERCEPTRONS H.Bourlard and C.J.Wellekens Philips Research Laboratory Brussels ABSTRACT Hidden Markov models are widely used for automatic speech recognition. They inherently incorporate the sequential character of the speech signal and are statistically trained. However, the a priori choice of a model topology limits the flexibility of the HMM's. Another drawback of these models is their weak discriminating power. Multilayer perceptrons are now promising tools in the connectionist approach for classification problems and have already been successfully tested on speech recognition problems. However, the sequential nature of the speech signal remains difficult to handle in that kind of machine. In this paper, a discriminant hidden Markov model is defined and it is shown how a particular multilayer perceptron with contextual and extra feedback input units can be considered as a general form of such Markov models. Relations with other recurrent networks commonly used in speech recognition are also pointed out. Chris From niranjan%digsys.engineering.cambridge.ac.uk at NSS.Cs.Ucl.AC.UK Mon Oct 10 11:59:27 1988 From: niranjan%digsys.engineering.cambridge.ac.uk at NSS.Cs.Ucl.AC.UK (M. Niranjan) Date: Mon, 10 Oct 88 11:59:27 BST Subject: Abstract Message-ID: <15317.8810101059@dsl.eng.cam.ac.uk> Here is an extended summary of a Tech report now available. Apologies for the incomplete de-TeXing. niranjan PS: Remember, reprint requests should be sent to "niranjan at dsl.eng.cam.ac.uk" and NOT "connectionists at q.cs.cmu.edu" ============================================================================= NEURAL NETWORKS AND RADIAL BASIS FUNCTIONS IN CLASSIFYING STATIC SPEECH PATTERNS Mahesan Niranjan & Frank Fallside CUED/F-INFENG/TR 22 University Engineering Department Cambridge, CB2 1PZ, England Email: niranjan at dsl.eng.cam.ac.uk SUMMARY This report compares the performances of three non-linear pattern classifiers in the recognition of static speech patterns.
Two of these classifiers are neural networks (Multi-layered perceptron and the modified Kanerva model (Prager & Fallside, 1988)). The third is the method of radial basis functions (Broomhead & Lowe, 1988). The high performance of neural-network based pattern classifiers shows that simple linear classifiers are inadequate to deal with complex patterns such as speech. The Multi-layered perceptron (MLP) gives a mechanism to approximate an arbitrary classification boundary (in the feature space) to a desired precision. Due to this power and the existence of a simple learning algorithm (error back-propagation), this technique is in very wide use nowadays. The modified Kanerva model (MKM) for pattern classification is derived from a model of human memory (Kanerva, 1984). It attempts to take advantage of certain mathematical properties of binary spaces of large dimensionality. The modified Kanerva model works with real valued inputs. It compares an input feature vector with a large number of randomly populated `location cells' in the input feature space; associated with every cell is a `radius'. Upon comparison, the cell outputs value 1 if the input vector lies within a volume defined by the radius; its output is zero otherwise. The discrimi- nant function of the Modified Kanerva classifier is a weighted sum of these location-cell outputs. It is trained by a gradient descent algorithm. The method of radial basis functions (RBF) is a technique for non-linear discrimination. RBFs have been used by Powell (1985) in multi-variable interpolation. The non-linear discriminant function in this method is of the form, g( x) = sum_j=1^m lambda_j phi (||x - x_j||) Here, x is the feature vector. lambda_j are weights associated with each of the given training examples x_j. phi is a kernel function defining the range of influence of each data point on the class boundary. For a particular choice of the phi function, and a set of training data {x_j,f_j}, j=1,...,N, the solution for the lambda_j s is closed-form. Thus this technique is computationally simpler than most neural networks. When used as a non- parametric technique, each computation at classification stage involves the use of all the training examples. This, however, is not a disadvantage since much of this computing can be done in parallel. In this report, we compare the performance of these classifiers on speech signals. Several techniques similar to the method of radial basis functions are reviewed. The properties of the class boundaries generated by the MLP, MKM and RBF are derived on simple two dimensional examples and an experimental comparison with speech data is given. ============================================================================ From terry Tue Oct 11 21:40:56 1988 From: terry (Terry Sejnowski ) Date: Tue, 11 Oct 88 21:40:56 edt Subject: NIPS Early Registration is October 15 Message-ID: <8810120140.AA19961@crabcake.cs.jhu.edu> ***** Note: Deadline for early registration discounts is October 15 ***** Mail in registration form below: IEEE Conference on "NEURAL INFORMATION PROCESSING SYSTEMS - NATURAL AND SYNTHETIC" November 28 - December 1, 1988 (Mon-Thurs), Sheraton Denver Tech Center, Denver, Colorado with a Post Meeting Workshop, December 1-4 Keystone Resort, Colorado The program stresses interdisciplinary interactions. All papers have been thoroughly refereed. Plenary lectures will bridge the gap between engineering and biology. 
-------------------------------------------------------------------- REGISTRATION FORM: NAME: Last First Middle Initial Business or Academic Title Professional Affiliation Street Address and Internal Mail Code City State Zip Country (if not U.S.) Telephone FEES: (Registration includes Monday reception, Wednesday night banquet and 3 Continental breakfasts.) Conference: Early (before Oct. 15, 1988) $ 175 Late (after Oct. 15, 1988) $ 225 Early Full-time students, with I.D. $ 35 Late Full-time students $ 50 Registration includes the welcoming reception Monday night, the banquet Wednesday night, and Continental breakfast all three days. Registration for the post meeting workshop is separate. Post Meeting Workshop (Deadline Oct. 15): Post-meeting workshop $ 75 Post-meeting workshop, students $ 60 Enclosed is a check or money order in U.S. dollars for $________ (Please make check payable to the Neural Information Processing Conference) Financial support may be available for students (see previous page) Please mail form to: Dr. Clifford Lau ONR Code 1114SE 800 North Quincy Street BCT #1 Arlington, Virginia 22217 FINANCIAL SUPPORT: Modest financial support for travel may be available to students, young faculty, and some senior faculty changing fields to work on neural networks. Those requesting support should write a one page summary of their background, research interest, and include a curriculum vitae, and mail to the Chairman, Dr. Terry Sejnowski, Dept. of Biophysics, Johns Hopkins University, Baltimore, MD, 21218. Applicants will be notified of awards (typically $250-500) by November 1. ---------------------------------------------------------------------- PROGRAM HIGHLIGHTS More than 300 papers were submitted to the conference; each was refereed by multiple referees. A number of invited talks will survey active areas of research and lead the way to contributed oral presentations. The following is the currently planned program. Monday, November 28, 1988 8:00 PM: Wine and Cheese Reception, Denver Tech Center Tuesday November 29, 1988 SESSION O1: Learning and Generalization Invited Talk 8:30 O1.1: "Birdsong Learning", Mark Konishi, Division of Biology, California Institute of Technology Contributed Talks 9:10 O1.2: "Comparing Generalization by Humans and Adaptive Networks", M. Pavel, M.A. Gluck, V. Henkle, Department of Psychology, Stanford University 9:40 O1.3: "An Optimality Principle for Unsupervised Learn- ing", T. Sanger, AI Lab, MIT 10:10 Break 10:30 O1.4: "Learning by Example with Hints", Y.S. Abu- Mostafa, California Institute of Technology, Department of Electrical Engineering 11:00 O1.5: "Associative Learning Via Inhibitory Search", D.H. Ackley, Cognitive Science Research Group, Bell Communi- cation Research, Morristown NJ 11:30 O1.6: "Speedy Alternatives to Back Propagation", J. Moody, C. Darken, Computer Science Department, Yale Univer- sity 12:00 Poster Session SESSION O2: Applications Tuesday Afternoon Invited Talk 2:20 O2.1: "Speech Recognition," John Bridle, Royal Radar Establishment, Malvern, U.K. Contributed Talks 3:00 O2.2: "Modularity in Neural Networks for Speech Recog- nition," A. Waibel, ATR Interpreting Telephony Research Lab- oratories, Osaka, Japan 3:30 O2.3: "Applications of Error Back-propagation to Pho- netic Classification," H.C. Leung, V.W. Zue, Department of Electrical Eng. & Computer Science, MIT 4:00 O2.4: "Neural Network Recognizer for Hand-Written Zip Code Digits: Representations,, Algorithms, and Hardware," J.S. Denker, H.P. Graf, L.D. Jackel, R.E. Howard, W. 
Hubbard, D. Henderson, W.R. Gardner, H.S. Baird, I. Guyon, AT&T Bell Laboratories, Holmdel, NJ 4:30 O2.5: "ALVINN: An Autonomous Land Vehicle in a Neural Network," D.A. Pomerleau, Computer Science Department, Carnegie Mellon University 5:00 O2.6: "A Combined Multiple Neural Network Learning System for the Classification of, Mortgage Insurance Appli- cations and Prediction of Loan Performance," S. Ghosh, E.A. Collins, C. L. Scofield, Nestor Inc., Providence, RI 8:00 PM Poster Session I Wednesday, November 30, 1988 AM SESSION O3: Neurobiology Invited Talk 8:30 O3.1: "Cricket Wind Detection," John Miller, Depart- ment of Zoology, UC Berkeley Contributed Talks 9:10 O3.2: "A Passive, Shared Element Analog Electronic Cochlea," D. Feld, J. Eisenberg, E.R. Lewis, Department of Electrical Eng. & Computer Science, University of California, Berkeley 9:40 O3.3: "Neuronal Maps for Sensory-motor Control in the Barn Owl," C.D. Spence, J.C. Pearson, J.J. Gelfand, R.M. Peterson, W.E. Sullivan, David Sarnoff Research Ctr, Subsid- iary of SRI International, Princeton, NJ 10:10 Break 10:30 O3.4: "Simulating Cat Visual Cortex: Circuitry Under- lying Orientation Selectivity," U.J. Wehmeier, D.C. Van Essen, C. Koch, Division of Biology, California Institute of Technology 11:00 O3.5: Model of Ocular Dominance Column Formation: Ana- lytical and Computational, Results," K.D. Miller, J.B. Keller, M.P. Stryker, Department of Physiology, University of California, San Francisco 11:30 O3.6: "Modeling a Central Pattern Generator in Soft- ware and Hardware:, Tritonia in Sea Moss," S. Ryckebusch, C. Mead, J. M. Bower, Computational Neural Systems Program, Caltech 12:00 Poster Session Wednesday PM SESSION O4: Computational Structures Invited Talk 2:20 O4.1: "Symbol Processing in the Brain," Geoffrey Hinton, Computer Science Department, University of Toronto Contributed Talks 3:00 O4.2: "Towards a Fractal Basis for Artificial Intelli- gence," Jordan Pollack, New Mexico State University, Las Cruces, NM 3:30 O4.3: "Learning Sequential Structure In Simple Recur- rent Networks," D. Servan-Schreiber, A. Cleeremans, J.L. McClelland, Department of Psychology, Carnegie-Mellon Uni- versity 4:00 O4.4: "Short-Term Memory as a Metastable State "Neurolocator," A Model of Attention", V.I. Kryukov, Re- search Computer Center, USSR Academy of Sciences 4:30 O4.5: "Heterogeneous Neural Networks for Adaptive Be- havior in Dynamic Environments," R.D. Beer, H.J. Chiel, L.S. Sterling, Center for Automation and Intelligent Sys. Res., Case Western Reserve University, Cleveland, OH 5:00 O4.6: "A Link Between Markov Models and Multilayer Perceptions," H. Bourlard, C.J. Wellekens, Philips Research Laboratory, Brussels, Belgium 7:00 PM Conference Banquet 9:00 Plenary Speaker "Neural Architecture and Function," Valentino Braitenberg, Max Planck Institut fur Biologische Kybernetik, West Germany Thursday, December 1, 1988 AM SESSION O5: Applications Invited Talk 8:30 O5.1: "Robotics, Modularity, and Learning," Rodney Brooks, AI Lab, MIT Contributed Talks 9:10 O5.2: "The Local Nonlinear Inhibition Circuit," S. Ryckebusch, J. Lazzaro, M. Mahowald, California Institute of Technology, Pasadena, CA 9:40 O5.3: "An Analog Self-Organizing Neural Network Chip," J. Mann, S. Gilbert, Lincoln Laboratory, MIT, Lexington, MA 10:10 Break 10:30 O5.4: "Performance of a Stochastic Learning Micro- chip," J. Alspector, B. Gupta, R.B. Allen, Bellcore, Morristown, NJ 11:00 O5.5: "A Fast, New Synaptic Matrix For Optically Pro- grammed Neural Networks," C.D. Kornfeld, R.C. 
Frye, C.C. Wong, E.A. Rietman, AT&T Bell Laboratories, Murray Hill, NJ 11:30 O5.6: "Programmable Analog Pulse-Firing Neural Net- works," Alan F. Murray, Lionel Tarassenko, Alister Hamilton, Department of Electrical Engineering, University of Edinburgh Scotland, UK 12:00 Poster Session 3:00 PM Adjourn to Keystone for workshops CONFERENCE PROCEEDING: The collected papers of the conference, called "Advances in Neural Information Processing Systems", Volume 1, will be available starting April 1, 1989. To reserve a copy, contact Morgan Kaufman Publishers, Inc., Order Fulfillment Department, P.O. Box 50490, Palo Alto, CA 94303, or call (415) 965-4081. \f3HOTEL REGISTRATION:\f1 The meeting in Denver is at the Denver Sheraton Hotel. The Sheraton Denver Tech Center is a suburban hotel located in the southeast corridor of Colorado in the exclusive Denver Technological Center Business Park. The property is 18 miles from the Stapleton International Airport, 3 miles from Centennial Airport and easily accessible by taxi or local airport shuttle services. TRANSPORTATION: Air Travel to Denver: Stapleton Airport is one of the major hubs in the U.S. and is served by numerous carriers, including United Airlines which has direct flights or close connections from almost all of its cities to Denver. As the "Official Airline" of the conference, United has pledged to discount the fares for conference attendees to below that offered by any other carrier (and is also making free tickets available for a drawing at the end of the conference). For reservations and further details, call 1-800-521-4041 and refer to account number 405IA. Ground Transport to Denver: (scheduled bus and van) - Shuttle service is available from the Southeast Airporter every 30 minutes at Door 6 on the baggage claim level, one way = $7 and round trip = $10. Car Rental: We have an agreement with Hertz Rental at the Sheraton for $20/day, 150 free miles/day with 30 cents/mile for each additional mile. Refer questions and reservations to Kevin Kline at Hertz (1-800-654-3131). POST MEETING WORKSHOP: December 1 - 4, 1988 Registration for the workshop is separate from the conference. It includes 2 continental breakfasts and one banquet dinner. FORMAT: Small group workshops will be held in the morning (7:30 - 9:30) and afternoon (4:30 - 6:30). The entire group will gather in the evening (8:30 - 10:30 p.m.) to hear the workshop leaders' reports, have further discussion and socialize. Last year this was a very successful format and we will again be open to suggestions from partricipants. Examples of the topics to be covered are Rules and Connectionist Modeling, Advances in Speech Recognition, New Experimental Methods, (especially optical recording with voltage sensitive dyes), Comparison of Hidden Markov vs. Neural Network Models, Complexity Issues, Neural Network vs. Traditional Classifiers, Real Neurons as Compared with Formal Neurons, Applications to Temporal Sequence Problems. Workshop Location: Keystone Resort will be the site of Neural Information Processing Systems workshops. Keystone Mountain is 70 miles west of Denver and offers the finest early season skiing in the Nation. Keystone Resort is a full service resort featuring world class skiing in addition to amenities including 11 swimming pools, indoor tennis, ice skating, 15 restaurants and a Village. Early December at Keystone provides an outstanding skiing value. IEEE has been able to secure to the following lodging rates. 
Accommodations: "Keystone Lodge": Single - 1 person $ 69.00 Double - 2 people $ 79.00 Condominiums Studio Condominium - 1 to 2 people $105.00 1 Bedroom Condominium - 1 to 4 people $130.00 These rates are per night and do not include 5.2% sales tax or 4.9% local surcharge. Please add $12.00 per person for persons over the stated levels. Attendance will be limited, and rooms not reserved by October 15th will be released back to the Keystone Resort. Keystone will extend discounted group lift tickets to IEEE workshop attendees for $17 per day. In addition, Keystone offers Night Skiing on over 200 acres of lighted terrain for $9 when extending your day ticket. Accommodations at Keystone may be reserved by calling 1-800-222-0188. When making your reservation, please refer to the group code DZ0IEEE to obtain the special conference rates. Transportation is available from the meeting site (the Sheraton Denver Tech Center) to Keystone and then from Keystone to Denver Stapleton International Airport, for $23 one way, or $46 round trip. Reservations can be made by completing the reservation form or by calling Keystone Resort at 1-800-451-5930. Hertz Rental cars are available at the Sheraton Denver Tech Center. Weekend rates are available. To reserve a Hertz rental car, call 1-800-654-3131. For those driving to Keystone, follow I-70 west to Exit 205, then take Highway 6 five miles west to Keystone. Word about Skiing at Summit County in December Keystone is joined by three other ski areas to compose Ski The Summit, Arapahoe Basic, Breckenridge, and Copper Mountain. All the areas are within 1/2 hour of each other and can be accessed by using the complimentary Summit Stage. Keystone features the largest snowmaking system in the West and traditionally has mid season conditions by December 1st. A Keystone lift ticket offers access to the three mountains; Keystone, the expert challenge of North Peak, and the Legend Arapahoe Basin. Keystone offers complete resort facilities including ski rental, ski school, nursery and mountain restaurants. Early December in Colorado is known as time for good skiing and minimal lift lines. From ajr at DSL.ENG.CAM.AC.UK Wed Oct 12 11:29:55 1988 From: ajr at DSL.ENG.CAM.AC.UK (Tony Robinson) Date: Wed, 12 Oct 88 11:29:55 BST Subject: Abstract Message-ID: <5917.8810121029@dsl.eng.cam.ac.uk> For those people who did not attend the nEuro'88 connectionists conference in Paris, our contribution is now available, abstract included below. Tony Robinson PS: Remember, reprint requests should be sent to "ajr at dsl.eng.cam.ac.uk" and NOT "connectionists at q.cs.cmu.edu" ============================================================================== A DYNAMIC CONNECTIONIST MODEL FOR PHONEME RECOGNITION A J Robinson, F Fallside Cambridge University Engineering Department Trumpington Street, Cambridge, England ajr at dsl.eng.cam.ac.uk ABSTRACT This paper describes the use of two forms of error propagation net trained to ascribe phoneme labels to successive frames of speech from multiple speakers. The first form places a fixed length window over the speech and labels the central portion of the window. The second form uses a dynamic structure in which the successive frames of speech and state vector containing context information are used to generate the output label. The paper concludes that the dynamic structure gives a higher recognition rate both in comparison with the fixed context structure and with the established k nearest neighbour technique. 
============================================================================ From terry at cs.jhu.edu Wed Oct 12 09:48:39 1988 From: terry at cs.jhu.edu (Terry Sejnowski ) Date: Wed, 12 Oct 88 09:48:39 edt Subject: NIPS phone numbers Message-ID: <8810121348.AA23558@crabcake.cs.jhu.edu> Rooms should be reserved at the Denver Tech Center (800-552-7030) and at Keystone (800-222-0188). There are a limited number reserved for the meeting at a discount rate on a first-come first-serve basis. A tentative list of topics and leaders for the post- conference workshop at Keystone includes: chair topic H. Gigley Rules and Connectionist Modelling A. Waibel Speech, especially NN's vs HMM's A. Grinvald Imaging techniques in Neurobiology I. Parberry Computational Complexity Issues M. Carter Fault Tolerance in Neural Networks C. Lau Evaluation Criteria for Neural Network Demonstrations If you want to chair a session contact Scott Kirkpatrick, program chairman: KIRK at ibm.com Terry ----- From pratt at paul.rutgers.edu Wed Oct 12 12:16:40 1988 From: pratt at paul.rutgers.edu (Lorien Y. Pratt) Date: Wed, 12 Oct 88 12:16:40 EDT Subject: Eric Baum to speak on formal results for generalization in neural nets Message-ID: <8810121616.AA21774@paul.rutgers.edu> Fall, 1988 Neural Networks Colloquium Series at Rutgers What Size Net Gives Valid Generalization? ----------------------------------------- Eric B. Baum Jet Propulsion Laboratory California Institute of Technology Pasadena, CA. 91109 Room 705 Hill center, Busch Campus Friday October 28, 1988 at 11:10 am Refreshments served before the talk Abstract We address the question of when a network can be expected to generalize from m random training examples chosen from some arbitrary probability distribution, assuming that future test examples are drawn from the same distribution. Among our results are the following bounds on appropriate sample vs. network size. Assume 0 < e <= 1/8. We show that if m >= O( WlogN/e log(1/e)) examples can be loaded on a feedforward network of linear threshold functions with N nodes and W weights, so that at least a fraction 1 - e/2 of the examples are correctly classified, then one has confidence approaching certainty that the network will correctly classify a fraction 1 - e of future test examples drawn from the same distribution. Conversely, for fully-connected feedforward nets with one hidden layer, any learning algorithm using fewer than Omega(W/e) random training examples will, for some distributions of examples consistent with an appropriate weight choice, fail at least some fixed fraction of the time to find a weight choice that will correctly classify more than a 1 - e fraction of the future test examples. From martha at cs.utexas.edu Thu Oct 13 12:45:36 1988 From: martha at cs.utexas.edu (Martha Weinberg) Date: Thu, 13 Oct 88 11:45:36 CDT Subject: Neural Computation Message-ID: <8810131645.AA22270@ratliff.cs.utexas.edu> From pratt at paul.rutgers.edu Thu Oct 13 15:23:39 1988 From: pratt at paul.rutgers.edu (Lorien Y. 
Pratt) Date: Thu, 13 Oct 88 15:23:39 EDT Subject: Cary Kornfeld to speak on hardware for neural nets & bitmapped graphics Message-ID: <8810131923.AA24651@zztop.rutgers.edu> Fall, 1988 Neural Networks Colloquium Series at Rutgers Bitmap Graphics and Neural Networks ----------------------------------- Cary Kornfeld AT&T Bell Laboratories Room 705 Hill center, Busch Campus Monday October 31, 1988 at 11:00 AM NOTE DAY AND TIME ARE DIFFERENT FROM USUAL Refreshments served before the talk From the perspective of system architecture and hardware design, bitmap graphics and neural networks are surprisingly alike. I will describe two key components of a graphics processor, designed and fabricated at Xerox PARC, this engine is based on Leo Guiba's Bitmap Calculus. While implementing that machine I got interested in building tiny, experimental flat panel displays. In the second part of this talk, I will describe a few of the early prototypes and (if facilities per- mit), will show a short video clip of their operation. When I arrived at Bell Labs three years ago I began building larger display panels using amorphous silicon, thin film transistors on glass substrates. It was this display work that gave birth to the idea of fabricating large neural networks using light sensitive synaptic elements. In May of this year we demonstrated working prototypes of these arrays in an ex- perimental neuro-computer at the Atlanta COMDEX show. This is one of the first neuro-computers built and is among the largest. Each of its 14,000 synapses is independently programmable over a continuous range of connection strength that can theoretically span more than five orders of magnitude (we've measured about three in our first-generation arrays). The computer has an animated, graphical user interface that en- ables the operator to both monitor and control its operation. This machine is "programmed" to solve a pattern reconstruction problem. (Again, facilities permitting) I will show a video tape of its operation and will demonstrate the user interface on a color SUN 3. From alexis at marzipan.mitre.org Fri Oct 14 10:28:14 1988 From: alexis at marzipan.mitre.org (Alexis Wieland) Date: Fri, 14 Oct 88 10:28:14 EDT Subject: Hopfield Paper ?? Message-ID: <8810141428.AA01379@marzipan.mitre.org.> I am interested in a pointer to a Hopfield (and Tank?) paper which I though existed and haven't been able to find. J.J. Hopfield has spoken about feed-forward nets described by his standard diff.eq.'s for accepting time-varying inputs. As I remember this system basically fed into a "Hopfield Net," but it's the time- varying feed-forward part that I am most interested in. My recollection is that there was an '86 paper on this, and I most recently heard about this at the AAAI Stanford workshop last spring. Any pointers to this or related work will be greatly appreciated. thanks in advance, Alexis Wieland wieland at mitre.arpa <= please use this address, just REPLYing may get confused From pollack at toto.cis.ohio-state.edu Fri Oct 14 15:12:37 1988 From: pollack at toto.cis.ohio-state.edu (Jordan B. Pollack) Date: Fri, 14 Oct 88 15:12:37 EDT Subject: tech report available In-Reply-To: Kathy Farrelly's message of 12 October 1988 1458-PDT (Wednesday) <8810122158.AA17267@sdics.ICS> Message-ID: <8810141912.AA01623@toto.cis.ohio-state.edu> Please and Thank you in advance, for a copy of Williams & Zipser's TR Jordan Pollack CIS Dept. 
OSU 2036 Neil Ave Columbus, OH 43210 From mcvax!bion.kth.se!orjan at uunet.UU.NET Thu Oct 13 04:48:35 1988 From: mcvax!bion.kth.se!orjan at uunet.UU.NET (Orjan Ekeberg) Date: Thu, 13 Oct 88 09:48:35 +0100 Subject: Paper from nEuro'88 in Paris Message-ID: <8810130848.AA23937@bogart.bion.kth.se> The following paper, presented at the nEuro'88 conference in Paris, has been sent for publication in the proceedings. Reprint requests can be sent to orjan at bion.kth.se (no cc to connectionists at ... please). =============== AUTOMATIC GENERATION OF INTERNAL REPRESENTATIONS IN A PROBABILISTIC ARTIFICIAL NEURAL NETWORK Orjan Ekeberg, Anders Lansner Department of Numerical Analysis and Computing Science The Royal Institute of Technology, S-100 44 Stockholm, Sweden ABSTRACT In a one layer feedback perceptron type network, the connections can be viewed as coding the pairwise correlations between activity in the corresponding units. This can then be used to make statistical inference by means of a relaxation technique based on Bayesian inference. When such a network fails, it might be because the regularities are not visible as pairwise correlations. One cure would then be to use a different internal coding where selected higher order correlations are explicitly represented. A method for generating this representation automatically is presented with a special focus on the network's ability to generalize properly. From pratt at paul.rutgers.edu Tue Oct 18 09:49:23 1988 From: pratt at paul.rutgers.edu (Lorien Y. Pratt) Date: Tue, 18 Oct 88 09:49:23 EDT Subject: Seminar: A Connectionist Framework for visual recognition Message-ID: <8810181349.AA01080@zztop.rutgers.edu> I saw this posted locally, thought you might like to attend. Don't forget Josh Alspector's talk this Friday (10/21) on his Boltzmann chip! Rutgers University CAIP (Center for Computer Aids for Industrial Productivity) Seminar: A Connectionist Framework for Visual Recognition Ruud Bolle Exploratory Computer Vision Group IBM Thomas J. Watson Research Center Abstract This talk will focus on the organization and implementation of a vision system to recognize 3D objects. The visual world being modeled is assumed to consist of objects that can be represented by planar patches, patches of quadrics of revolution, and the intersection curves of those quadric surfaces. A significant portion of man-made objects can be represented using such primitives. One of the contributions of this vision system is that fundamentally different feature types, like surface and curve descriptions, are simultaneously extracted and combined to index into a database of objects. The input to the system is a depth map of a scene comprising one or more objects. From the depth map, surface parameters and surface intersection/object limb parameters are extracted. Parameter extraction is modeled as a set of layered and concurrent parameter space transforms. Any one transform computes only a partial geometric description that forms the input to the next transform.
The final transform is a mapping into an object database, which can be viewed as the highest level of confidence for geometric descriptions and 3D objects within the parameter spaces. The approach is motivated by connectionist models of visual recognition. Date: Friday, November 4, 1988 Time: 3:00 PM Place: Conference room 139, CAIP center, Busch Campus For information, call (201) 932-3443 From pratt at paul.rutgers.edu Tue Oct 18 09:59:36 1988 From: pratt at paul.rutgers.edu (Lorien Y. Pratt) Date: Tue, 18 Oct 88 09:59:36 EDT Subject: Test message, please ignore. Message-ID: <8810181359.AA01171@zztop.rutgers.edu> From terry at cs.jhu.edu Tue Oct 18 18:00:21 1988 From: terry at cs.jhu.edu (Terry Sejnowski ) Date: Tue, 18 Oct 88 18:00:21 edt Subject: NIPS Student Travel Awards Message-ID: <8810182200.AA09780@crabcake.cs.jhu.edu> We have around 80 student applications for travel awards for the NIPS meeting in November. All students who are presenting an oral paper or poster will receive $250-500 depending on their expenses. Other students that have applied will very likely receive at least $250 --- but this depends on what the registration looks like at the end of the month. Official letters will go out on November 1. The deadline for 30 day supersaver fares is coming up soon. There is a $200+ savings for staying over Saturday night, so students who want to go to the workshop can actually save travel money by doing so. Terry ----- From INAM%MCGILLB.BITNET at VMA.CC.CMU.EDU Wed Oct 19 22:18:00 1988 From: INAM%MCGILLB.BITNET at VMA.CC.CMU.EDU (INAM000) Date: WED 19 OCT 1988 21:18:00 EST Subject: Outlets for theoretical work Message-ID: Department of Psychology, McGill University, 1205 Avenue Dr. Penfield, Montreal, Quebec, CANADA H3Y 2L2 October 20, 1988 Dear Connectionists, Given the recent resurgence of formal analysis of "neural networks" (e.g. White, Gallant, Geman, Hanson, Burr), and the difficulty some people seem to have in finding an appropriate outlet for this work, I would like to remind researchers of the existence of the Journal of Mathematical Psychology. This is an Academic Press Journal that has been in existence for over 20 years, and is quite open to all kinds of mathematical papers in "theoretical" (i.e. mathematical, logical, computational) "psychology" (broadly interpreted). If you want further details regarding the Journal, or feedback about the appropriateness of a particular article, you can contact me by E-mail or telephone (514-398-6128), or contact the Editor directly - J. Townsend, Department of Psychology, Pierce Hall, Rm. 365A, West Lafayette, IND 47907. (E-Mail: KC at BRAZIL.PSYCH.PURDUE.EDU; Telephone: 317-494-6236). The address for manuscript submission is: Journal of Mathematical Psychology, Editorial Office, Seventh Floor, 1250 Sixth Avenue, San Diego, CA 92101. Regards Tony A.A.J. Marley Professor Book Review Editor, Journal of Mathematical Psychology E-MAIL - INAM at MUSICB.MCGILL.CA From MJ_CARTE%UNHH.BITNET at VMA.CC.CMU.EDU Mon Oct 24 11:13:00 1988 From: MJ_CARTE%UNHH.BITNET at VMA.CC.CMU.EDU (MJ_CARTE%UNHH.BITNET@VMA.CC.CMU.EDU) Date: Mon, 24 Oct 88 11:13 EDT Subject: Room mate sought for NIPS/Post-Mtg. Workshop Message-ID: I'm a new faculty member on a tight travel budget, and I'm looking for a room mate to split lodging expenses with at both the NIPS conference and at the post-meeting workshop (need not be the same person at each location). Please reply via e-mail to or call me at (603) 862-1357.
Mike Carter Intelligent Structures Group University of New Hampshire From gluck at psych.Stanford.EDU Mon Oct 24 13:07:57 1988 From: gluck at psych.Stanford.EDU (Mark Gluck) Date: Mon, 24 Oct 88 10:07:57 PDT Subject: Reprints avail. Message-ID: Reprints of the following two papers are available by netrequest to gluck at psych.stanford.edu or by writing: Mark Gluck, Dept. of Psychology, Jordan Hall; Bldg. 420, Stanford Univ., Stanford, CA 94305. Gluck, M. A., & Bower, G. H. (1988) From conditioning to category learning: An adaptive network model. Journal of Experimental Psychology: General, V. 117, N. 3, 227-247 Abstract -------- We used adaptive network theory to extend the Rescorla-Wagner (1972) least mean squares (LMS) model of associative learning to phenomena of human learning and judgment. In three experiments subjects learned to categorize hypothetical patients with particular symptom patterns as having certain diseases. When one disease is far more likely than another, the model predicts that subjects will sub- stantially overestimate the diagnosticity of the more valid symptom for the rare disease. The results of Experiments 1 and 2 provide clear support for this prediction in contradistinction to predictions from probability matching, exemplar retrieval, or simple prototype learning models. Experiment 3 contrasted the adaptive network model with one predicting pattern-probability matching when patients always had four symptoms (chosen from four opponent pairs) rather than the presence or absence of each of four symptoms, as in Experiment 1. The results again support the Rescorla-Wagner LMS learning rule as embedded within an adaptive network. Gluck, M. A., Parker, D. B., & Reifsnider, E. (1988) Some biological implications of a differential-Hebbian learning rule. Psychobiology, Vol. 16(3), 298-302 Abstract -------- Klopf (1988) presents a formal real-time model of classical conditioning which generates a wide range of behavioral Pavlovian phenomena. We describe a replication of his simulation results and summarize some of the strengths and shortcomings of the drive- reinforcement model as a real-time behavioral model of classical conditioning. To facilitate further comparison of Klopf's model with neuronal capabilities, we present a pulse-coded reformulation of the model that is more stable and easier to compute than the original, frequency-based model. We then review three ancillary assumptions to the model's learning algorithm, noting that each can be seen as dually motivated by both behavioral and biological considerations. From Scott.Fahlman at B.GP.CS.CMU.EDU Tue Oct 25 21:24:04 1988 From: Scott.Fahlman at B.GP.CS.CMU.EDU (Scott.Fahlman@B.GP.CS.CMU.EDU) Date: Tue, 25 Oct 88 21:24:04 EDT Subject: Political science viewed as a Neural Net :-) Message-ID: I was thinking about the upcoming U.S. election today, and it occurred to me that the seemingly useless electoral college mandated by the U.S. constitution might actually be of some value. A direct democratic election is basically a threshold decision function with lots of inputs and with fixed weights; add the electoral college and you've got a layered network with fifty hidden units, each with a non-linear threshold function. A direct election can only choose a winner based on some linearly separable function of voter opinions. You would expect to see complex issues forcibly projected onto some crude 1-D scale (e.g. "liberal" vs. "conservative" or "wimp" vs. "macho"). 
With a multi-layer decision network the system should be capable of performing a more complex separation of the feature space. Though they lacked the sophisticated mathematical theories available today, the designers of our constitution must have sensed the severe computational limitations of direct democracy and opted for the more complex decision system. Unfortunately, we do not seem to be getting the full benefit of this added flexibility. What the founding fathers left out of this multi-layer network is a mechanism for adjusting the weights in the network based on how well the decision ultimately turned out. Perhaps some form of back-propagation would work here. It might be hard to agree on a proper error measure, but the idea seems worth exploring. For example, everyone who voted for Nixon in 1972 should have the weight of his his future votes reduced by epsilon; a large momentum term would be added to the reduction for those people who had voted for Nixon previously. The reduction would be greater for voters in states where the decision was close (if any such states can be found). There is already a mechanism in place for altering the output weights of the hidden units: those states that correlate positively with the ultimate decision end up with more political "clout", then with more defense-related jobs. This leads to an influx of people and ultimately to more electoral votes for that state. Some sort of weight-decay term would be needed to prevent a runaway process in which all of the people end up in California. We might also want to add more cross-connections in the network. At present, each voter affects only one hidden unit, the state where he resides. This somewhat limits the flexibility of the learning process in assigning arbitrary functions to the hidden units. To fix this, we could allow voters to register in more than one state. George Bush has five or six home states; why not make this option available to all voters? More theoretical analysis of this complex system is needed. Perhaps NSF should fund a center for this kind of thinking. The picture is clouded by the observation that individual voters are not simple predicates: most of them have a rudimentary capacity for simple inference and in some cases they even exhibit a form of short-term learning. However, these minor perturbations probably cancel out on the average, and can be treated as noise in the decision units. Perhaps the amount of noise can be manipulated to give a crude approximation to simulated annealing. -- Scott From pratt at paul.rutgers.edu Wed Oct 26 09:15:09 1988 From: pratt at paul.rutgers.edu (Lorien Y. Pratt) Date: Wed, 26 Oct 88 09:15:09 EDT Subject: Neural networks, intelligent machines, and the AI wall: Jack Gelfand Message-ID: <8810261315.AA10043@paul.rutgers.edu> This looks like it'll be an especially interesting and controversial talk for our department. I hope you all can make it! --Lori Fall, 1988 Neural Networks Colloquium Series at Rutgers NEURAL NETWORKS, INTELLIGENT MACHINES AND THE AI WALL Jack Gelfand The David Sarnoff Research Center SRI International Princeton,N.J. Room 705 Hill center, Busch Campus Friday November 4, 1988 at 11:10 am Refreshments served before the talk When we look back at the last 25 years of AI research, we find that there have been many new techniques which have promised to produce intelligent machines for real world applications. 
Though the performance of some of these machines is quite extraordinary, very few have approached the performance of human beings for even the most rudimentary tasks. We believe that this is due to the fact that these methods have been largely monolithic, whereas biological systems approach these problems by combining many different modes of processing into integrated systems. A number of real and artificial neural network systems will be discussed in terms of how knowledge is represented, combined and processed in order to solve complex problems. From pratt at paul.rutgers.edu Wed Oct 26 12:27:42 1988 From: pratt at paul.rutgers.edu (Lorien Y. Pratt) Date: Wed, 26 Oct 88 12:27:42 EDT Subject: Looking for references to work on connectionism & databases Message-ID: <8810261627.AA01025@zztop.rutgers.edu> I am interested in work which might relate neural networks to databases in any way. Please send me any relevant references. Thanks, Lori From JRICE at cs.tcd.ie Thu Oct 27 06:23:00 1988 From: JRICE at cs.tcd.ie (JRICE@cs.tcd.ie) Date: Thu, 27 Oct 88 10:23 GMT Subject: Looking for references on connectionism and Protein Structure Message-ID: I am interested in applying connectionism to the generation of protein secondary and tertiary structure from the primary amino acid sequence. Could you please send me any relevant references. Thanks, John. From terry Thu Oct 27 09:19:18 1988 From: terry (Terry Sejnowski ) Date: Thu, 27 Oct 88 09:19:18 edt Subject: Looking for references on connectionism and Protein Structure Message-ID: <8810271319.AA05074@crabcake.cs.jhu.edu> The best existing method for predicting the secondary structure of a globular protein is a neural network: Qian, N. and Sejnowski, T. J. (1988) Predicting the secondary structure of globular proteins using neural network models. Journal of Molecular Biology 202, 865-884. Our results also indicate that only marginal improvements on our performance will be possible with local methods. Tertiary (3-D) structure is a much more difficult problem for which there are no good methods. Terry ----- From steeg at ai.toronto.edu Thu Oct 27 11:13:54 1988 From: steeg at ai.toronto.edu (Evan W. Steeg) Date: Thu, 27 Oct 88 11:13:54 EDT Subject: Looking for references on connectionism and Protein Structure Message-ID: <88Oct27.111408edt.7119@neat.ai.toronto.edu> In addition to the Qian & Sejnowski work, a potent illustration of the capabilities of (even today's rather simplistic) neural nets, there is: Bohr, N., Bohr, J., Brunak, S., Cotterill, R.M.J., Lautrup, B., Norskov, L., Olsen, O.H., and Petersen, S.B. (of the Bohr Institute and Tech. Univ. of Denmark), "Revealing Protein Structure by Neural Networks", presented as a poster at the Fourth International Symposium on Biological and Artificial Intelligence Systems, Trento, Italy 1988. Presumably, a paper will be published shortly. They use a feed-forward net and back-propagation, like Qian and Sejnowski, but use separate nets for each of 3 kinds of protein secondary structure, rather than a single net, and there are other methodological differences as well. Others, myself included, are using neural net techniques which exploit global interactions (from the whole molecule) in addition to local ones. This should, as Dr. Sejnowski pointed out, lead to more accurate structure prediction. Results of this work will begin to appear within a couple of months. -- Evan Evan W. Steeg (416) 978-7321 steeg at ai.toronto.edu (CSnet,UUCP,Bitnet) Dept of Computer Science steeg at ai.utoronto (other Bitnet) University of Toronto, steeg at ai.toronto.cdn (EAN X.400) Toronto, Canada M5S 1A4 {seismo,watmath}!ai.toronto.edu!steeg
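The sliding-window approach used by these secondary-structure predictors is easy to sketch. The Python fragment below is only an illustration of the idea: the window size, the one-of-20 encoding, the layer sizes and the single fake training example are placeholders and do not reproduce the published architectures of Qian & Sejnowski or of the Bohr group.

    import numpy as np

    # Minimal sketch of a sliding-window secondary structure predictor: a window
    # of residues centred on the residue of interest is encoded and fed to a
    # small feed-forward net that outputs helix / sheet / coil probabilities.
    AMINO = "ACDEFGHIKLMNPQRSTVWY"          # 20 residue types
    CLASSES = ["helix", "sheet", "coil"]
    WIN = 13                                 # residues per input window (illustrative)

    def encode(window):
        """One-of-20 encoding of a WIN-residue window -> vector of length 20*WIN."""
        x = np.zeros(20 * WIN)
        for i, aa in enumerate(window):
            x[20 * i + AMINO.index(aa)] = 1.0
        return x

    rng = np.random.default_rng(0)
    W1 = rng.normal(0, 0.1, (40, 20 * WIN))   # input -> hidden
    W2 = rng.normal(0, 0.1, (3, 40))          # hidden -> output

    def forward(x):
        h = np.tanh(W1 @ x)
        z = W2 @ h
        p = np.exp(z - z.max()); p /= p.sum() # softmax over helix/sheet/coil
        return h, p

    def train_step(x, target_index, lr=0.05):
        """One gradient step on cross-entropy loss (plain back-propagation)."""
        global W1, W2
        h, p = forward(x)
        dz = p.copy(); dz[target_index] -= 1.0     # dL/dz
        dW2 = np.outer(dz, h)
        dh = W2.T @ dz
        dW1 = np.outer(dh * (1 - h ** 2), x)
        W2 -= lr * dW2
        W1 -= lr * dW1

    # Fake training example: a random window labelled "helix".
    window = "".join(rng.choice(list(AMINO), WIN))
    for _ in range(50):
        train_step(encode(window), CLASSES.index("helix"))
    print(forward(encode(window))[1])   # probability mass shifts toward helix

Real systems train such a net on windows cut from proteins of known structure and report the per-residue accuracy for each class.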
From hinton at ai.toronto.edu Thu Oct 27 15:39:44 1988 From: hinton at ai.toronto.edu (Geoffrey Hinton) Date: Thu, 27 Oct 88 15:39:44 EDT Subject: How to ensure that back-propagation separates separable patterns Message-ID: <88Oct27.130054edt.6408@neat.ai.toronto.edu> We have recently obtained the following results, and would like to know if anyone has seen them previously written up. The description is informal but accurate: There has been some interest in understanding the behavior of backpropagation in feedforward nets with no hidden neurons. This is a particularly simple case that does not require the full power of back-propagation (it can also be approached from the point of view of perceptrons), but we believe that it provides useful insights into the structure of the error surface, and that understanding this case is a prerequisite for understanding the more general case with hidden layers. In [1], the authors give examples illustrating the fact that while a training set may be separable, a net performing backpropagation (gradient) search may get stuck in a solution which fails to separate. The objective of this note is to point out that if one uses instead a threshold procedure, where we do not penalize values ``beyond'' the targets, then such counterexamples cease to exist, and in fact that one has a convergence theorem that closely parallels that for perceptrons: the continuous gradient adjustment procedure is such that, from any initial weight configuration, a separating set of weights is obtained in finite time. We also show how to modify the example given in [2] to conclude that there is a training set consisting of 125 binary vectors and a network configuration for which there are nonglobal local minima, even if threshold LMS is used. In this example, the training set is of course not separable. Finally, we compare our results to more classical pattern recognition theorems, in particular to the convergence of the relaxation procedure for threshold LMS with linear output units ([3]), and show that the continuous adjustment approach is essential in the study of the nonlinear case. Another essential difference with the linear case is that in the latter nonglobal local minima cannot happen even if the data is not separable. References: [1] Brady, M., R. Raghavan and J. Slawny, ``Backpropagation fails to separate where perceptrons succeed,'' submitted for publication. Summarized version in ``Gradient descent fails to separate,'' in {\it Proc. IEEE International Conference on Neural Networks}, San Diego, California, July 1988, Vol. I, pp. 649-656. [2] Sontag, E.D., and H.J. Sussmann, ``Backpropagation can give rise to spurious local minima even for networks without hidden layers,'' submitted. [3] Duda, R.O., and P.E. Hart, {\it Pattern Classification and Scene Analysis}, Wiley, New York, 1973. Geoff Hinton Eduardo Sontag Hector Sussmann
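A minimal numerical sketch of the thresholded error term described in the note (Python, a single linear output unit, targets of +1/-1, invented data). On an easy separable set like this both variants typically succeed; the counterexamples of [1] require carefully constructed patterns, so the sketch shows only the form of the update, not the failure itself.

    import numpy as np

    # Plain LMS versus the "threshold" error of the note above: with targets of
    # +1/-1, an output that is already beyond its target incurs no penalty and
    # hence contributes no gradient.  Data and step sizes are invented.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(20, 2))
    w_true = np.array([2.0, -1.0])
    t = np.where(X @ w_true > 0, 1.0, -1.0)     # a separable labelling
    X = np.hstack([X, np.ones((20, 1))])        # bias input

    def train(threshold_error, epochs=200, lr=0.1):
        w = np.zeros(3)
        for _ in range(epochs):
            y = X @ w                            # linear output unit
            err = t - y
            if threshold_error:
                # zero error whenever the output is "beyond" the target
                err = np.where(t * y >= 1.0, 0.0, err)
            w += lr * (X.T @ err) / len(t)
        return w

    for mode in (False, True):
        w = train(mode)
        frac = np.mean(np.sign(X @ w) == t)
        print("threshold error:" if mode else "plain LMS:      ", f"{frac:.2f} correct")

The only difference from plain LMS is the line that zeroes the error whenever t*y >= 1, which is the "do not penalize values beyond the targets" rule.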
From rpl at ll-sst.arpa Fri Oct 28 09:42:23 1988 From: rpl at ll-sst.arpa (Richard Lippmann) Date: Fri, 28 Oct 88 09:42:23 EDT Subject: reply on back-propagation fails to separate paper Message-ID: <8810281342.AA03049@ll-sst.arpa> Geoff, We came up with the same conclusion a while ago when some people were worried about the performance of back propagation but never published it. Back propagation with limits seems to converge correctly for those contrived deterministic cases where minimizing total squared error does not minimize the percent patterns classified correctly. The limits cause the algorithm to change from an LMS mean-square minimizing approach to a perceptron-like, error-corrective approach. Typically, however, the difference in percent patterns classified correctly between the local and global solutions in those cases tends to be small. In practice, we found that convergence for the one contrived case we tested with limits took rather a long time. I have never seen this published and it would be good to see your result published with a convergence proof. I have also seen little published on the effect of limits on performance of classifiers or on final weight values. Rich From Mark.Derthick at MCC.COM Fri Oct 28 14:27:00 1988 From: Mark.Derthick at MCC.COM (Mark.Derthick@MCC.COM) Date: Fri, 28 Oct 88 13:27 CDT Subject: TR available Message-ID: <19881028182725.0.DERTHICK@THORN.ACA.MCC.COM> For copies of my thesis, ask copetas at cs.cmu.edu for CMU-CS-88-182 "Mundane Reasoning by Parallel Constraint Satisfaction." I am 1200 miles away from the reports, so asking me doesn't do you any good: Mark Derthick MCC 3500 West Balcones Center Drive Austin, TX 78759 (512)338-3724 Derthick at MCC.COM If you have previously asked me for this report, it should be arriving soon. There aren't many extra copies right now, so requests to copetas may be delayed for a while. ABSTRACT Connectionist networks are well suited to everyday common sense reasoning. Their ability to simultaneously satisfy multiple soft constraints allows them to select from conflicting information in finding a plausible interpretation of a situation. However, these networks are poor at reasoning using the standard semantics of classical logic, based on truth in all possible models. This thesis shows that using an alternate semantics, based on truth in a single most plausible model, there is an elegant mapping from theories expressed using the syntax of propositional logic onto connectionist networks. An extension of this mapping to allow for limited use of quantifiers suffices to build a network from knowledge bases expressed in a frame language similar to KL-ONE. Although finding optimal models of these theories is intractable, the networks admit a fast hill climbing search algorithm that can be tuned to give satisfactory answers in familiar situations. The Role Shift problem illustrates the potential of this approach to harmonize conflicting information, using structured distributed representations. Although this example works well, much remains to be done before realistic domains are feasible. From mehra at aquinas.csl.uiuc.edu Fri Oct 28 12:44:02 1988 From: mehra at aquinas.csl.uiuc.edu (Pankaj Mehra) Date: Fri, 28 Oct 88 11:44:02 CDT Subject: reply on back-propagation fails to separate paper Message-ID: <8810281644.AA06760@aquinas> Hi everybody. When I heard Brady et al.'s talk at ICNN-88, I thought that the results simply pointed out that a correct approach to classification may not give equal importance to all training samples. As is well-known, classical back-prop converges to a separating surface that depends on the LMS error summed uniformly over all training samples. I think that the new results provide a case for attaching more importance to the elements on concept boundaries. I have been working on this problem (of trying to characterize "boundary" elements) off and on, without much success.
Basically, geometric characterizations exist but they are too complex to evaluate. What is interesting, however, is the fact that the complexity of learning (and hence the time for convergence) depends on the nature of the separating surface. Theoretical results also involve similar concepts, e.g. VC-dimension. Also notice that if one could somehow "learn" the characteristics of boundary elements, then one could ignore a large part of the training sample and still converge properly using a threshold procedure like that suggested in Geoff's note. Lastly, since back-prop is not constrained to always use LMS as the error function, one wonders if there is an intelligent method (that can be automated) for constructing error functions based on the complexity of the separating surface. - Pankaj Mehra {mehra%aquinas at uxc.cso.uiuc.edu} From pratt at paul.rutgers.edu Fri Oct 28 16:54:11 1988 From: pratt at paul.rutgers.edu (Lorien Y. Pratt) Date: Fri, 28 Oct 88 16:54:11 EDT Subject: Production rule LHS pattern matching with neural nets Message-ID: <8810282054.AA02887@zztop.rutgers.edu> Hello, I'm interested in any work involving matching the left-hand side of a production rule using a neural network. I imagine that one could also use a representation language for the LHS which isn't first-order logic and which might be more amenable to a neural network approach. Thanks (again!) for any pointers, Lori ------------------------------------------------------------------- Lorien Y. Pratt Computer Science Department pratt at paul.rutgers.edu Rutgers University Busch Campus (201) 932-4634 Piscataway, NJ 08854 From mkon at bu-cs.BU.EDU Fri Oct 28 19:35:30 1988 From: mkon at bu-cs.BU.EDU (mkon@bu-cs.BU.EDU) Date: Fri, 28 Oct 88 19:35:30 EDT Subject: reply on back-propagation fails to separate paper In-Reply-To: Pankaj Mehra's message of Fri, 28 Oct 88 11:44:02 CDT <8810281644.AA06760@aquinas> Message-ID: <8810282335.AA24173@bucsd.bu.edu> I would appreciate a preprint or any related information (say, other references) related to the discussion you presented on the connectionist network today. Thanks in advance. Mark A. Kon Department of Mathematics Boston University Boston, MA 02215 mkon at bu-cs.bu.edu From sontag at fermat.rutgers.edu Sat Oct 29 10:50:52 1988 From: sontag at fermat.rutgers.edu (Eduardo Sontag) Date: Sat, 29 Oct 88 10:50:52 EDT Subject: reply on back-propagation fails to separate paper In-Reply-To: <8810282335.AA24173@bucsd.bu.edu> (mkon@bu-cs.bu.edu) Message-ID: <8810291450.AA21199@control.rutgers.edu> Mark, Re your question to Mehra about complexity of boundaries and VC dimension, we just had a talk at Rutgers yesterday by Eric Baum (baum at pupgg.princeton.edu) about this. You should ask him for copies of his papers on the subject, which also contain references to Valiant's and other related work. I think that the relation between Brady et al. and our results can be explained better in terms of threshold vs. nonthreshold costs than in terms of relative weightings of terms. -eduardo Eduardo D. Sontag Rutgers Center for Systems and Control (SYCON) Rutgers University (sontag at fermat.rutgers.edu)
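One crude way to make the "boundary element" idea concrete is to rank training patterns by their distance from the current decision surface and keep only the nearest ones. The Python fragment below is a heuristic illustration only; the data, the current weight vector and the 20% cut-off are invented, and it is not a published selection rule.

    import numpy as np

    # Rank training patterns by the margin |w.x| / ||w|| of a current linear
    # classifier and keep only the ones closest to the decision surface.
    rng = np.random.default_rng(2)
    X = rng.normal(size=(200, 2))
    labels = np.where(X[:, 0] + 0.5 * X[:, 1] > 0, 1, -1)

    w = np.array([1.0, 0.4])                     # some current weight estimate
    margins = np.abs(X @ w) / np.linalg.norm(w)

    keep = margins < np.quantile(margins, 0.2)   # nearest 20% of the patterns
    print("kept", keep.sum(), "of", len(X), "patterns;",
          "class balance:", (labels[keep] == 1).sum(), "vs", (labels[keep] == -1).sum())
    # a learner would now be trained on X[keep], labels[keep] only

Training only on the retained patterns is the kind of non-uniform weighting of the training sample that the preceding notes allude to.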
From ajr at DSL.ENG.CAM.AC.UK Mon Oct 31 06:14:50 1988 From: ajr at DSL.ENG.CAM.AC.UK (Tony Robinson) Date: Mon, 31 Oct 88 11:14:50 GMT Subject: Tech report available Message-ID: <2402.8810311114@dsl.eng.cam.ac.uk> Here is the summary of a tech report which demonstrates that the error propagation algorithm is not limited to weighted-sum type nodes, but can be used to train radial-basis-function type nodes and others. Send me some email if you would like a copy. Tony. P.S. If you asked for a copy of my/our last paper, I've taken the liberty of sending you a hard copy of this one as well. Thank you for replying to ajr at dsl.eng.cam.ac.uk not connectionists at ... `'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`' Generalising the Nodes of the Error Propagation Network CUED/F-INFENG/TR.25 A J Robinson, M Niranjan, F Fallside Cambridge University Engineering Department Trumpington Street, Cambridge, England email: ajr at uk.ac.cam.eng.dsl 1 November 1988 Gradient descent has been used with much success to train connectionist models in the form of the Error Propagation Network (Rumelhart, Hinton and Williams, 1986). In these nets the output of a node is a non-linear function of the weighted sum of the activations of other nodes. This type of node defines a hyper-plane in the input space, but other types of nodes are possible. For example, the Kanerva Model (Kanerva, 1984), the Modified Kanerva Model (Prager and Fallside, 1988), networks of Spherical Graded Units (Hanson and Burr, 1987), networks of Localised Receptive Fields (Moody and Darken, 1988) and the method of Radial Basis Functions (Powell, 1985; Broomhead and Lowe, 1988) all use nodes which define volumes in the input space. Niranjan and Fallside (1988) summarise these and compare the class boundaries formed by this family of networks with feed-forward networks and nearest neighbour classifiers. This report shows that the error propagation algorithm can be used to train general types of node. The example of a Gaussian node is given and this is compared with other connectionist models for the problem of recognition of steady state vowels from multiple speakers. From pratt at paul.rutgers.edu Mon Oct 31 12:57:09 1988 From: pratt at paul.rutgers.edu (Lorien Y. Pratt) Date: Mon, 31 Oct 88 12:57:09 EST Subject: Paul Thagard to speak on Analogical thinking Message-ID: <8810311757.AA04514@zztop.rutgers.edu> COGNITIVE PSYCHOLOGY FALL COLLOQUIUM SERIES (Rutgers University) Date: 9 November 1988 Time: 4:30 PM Place: Room 307, Psychology Building, Busch Campus Paul Thagard, Cognitive Science Program, Princeton University ANALOGICAL THINKING Analogy is currently a very active area of research in both cognitive psychology and artificial intelligence. Keith Holyoak and I have developed connectionist models of analogical retrieval and mapping that are consistent with the results of psychological experiments. The new models use localist networks to simultaneously satisfy a set of semantic, structural, and pragmatic constraints. After providing a general view of analogical thinking, this talk will describe our model of analog retrieval.
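Returning to the Robinson, Niranjan & Fallside report above: training a "volume defining" node by error propagation just means pushing gradients through that node's own parameters. The Python sketch below does this for a single Gaussian node with output y = exp(-||x - c||^2 / (2 s^2)); the data, dimensions and learning rate are illustrative and this is not the report's code.

    import numpy as np

    # A Gaussian ("radial basis") node whose centre c and width s are trained by
    # gradient descent, just like ordinary weights in an error propagation net.
    def gaussian_node(x, c, s):
        d2 = np.sum((x - c) ** 2)
        return np.exp(-d2 / (2 * s ** 2))

    def grads(x, c, s, dE_dy):
        """Chain rule: gradients of the error w.r.t. centre and width."""
        y = gaussian_node(x, c, s)
        d2 = np.sum((x - c) ** 2)
        dE_dc = dE_dy * y * (x - c) / s ** 2          # dy/dc = y*(x-c)/s^2
        dE_ds = dE_dy * y * d2 / s ** 3               # dy/ds = y*d2/s^3
        return dE_dc, dE_ds

    # Fit one node so that y(x0) approaches a target of 1.0 (squared error).
    x0, target = np.array([0.5, -0.2]), 1.0
    c, s = np.array([2.0, 2.0]), 1.0
    for _ in range(200):
        y = gaussian_node(x0, c, s)
        dE_dy = y - target                            # E = 0.5*(y - target)^2
        dc, ds = grads(x0, c, s, dE_dy)
        c -= 0.5 * dc
        s -= 0.5 * ds
    print(gaussian_node(x0, c, s))   # approaches the target of 1.0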
From pratt at paul.rutgers.edu Mon Oct 31 13:13:18 1988 From: pratt at paul.rutgers.edu (Lorien Y. Pratt) Date: Mon, 31 Oct 88 13:13:18 EST Subject: Schedule of remaining talks this semester Message-ID: <8810311813.AA04698@zztop.rutgers.edu> Speaker schedule as of 10/31/88 for end of the semester talks in the Fall, 1988 Neural Networks Colloquium Series at Rutgers.

Speaker         Date      Title
-------         ----      -----
Jack Gelfand    11/4/88   Neural networks, Intelligent Machines, and the AI wall
Mark Jones      11/11/88  Knowledge representation in connectionist networks, including inheritance reasoning and default logic.
E. Tzanakou     11/18/88  ALOPEX: Another optimization method
Stefan Shrier   12/9/88   Abduction Machines for Grammar Discovery

From cpd at CS.UCLA.EDU Mon Oct 31 16:28:05 1988 From: cpd at CS.UCLA.EDU (Charles Dolan) Date: Mon, 31 Oct 88 13:28:05 PST Subject: Tech report on connectionist knowledge processing Message-ID: <881031.212805z.29548.cpd@oahu.cs.ucla.edu>

Implementing a connectionist production system using tensor products
September, 1988
UCLA-AI-88-15 CU-CS-411-88

Charles P. Dolan                        Paul Smolensky
AI Center                               Department of Computer Science &
Hughes Research Labs                    Institute of Cognitive Science
3011 Malibu Canyon Rd.                  University of Colorado
Malibu, CA 90265                        Boulder, CO 80309-0430
& UCLA AI Laboratory

Abstract In this paper we show that the tensor product technique for constructing variable bindings and for representing symbolic structure -- used by Dolan and Dyer (1987) in parts of a connectionist story understanding model, and analyzed in general terms in Smolensky (1987) -- can be effectively used to build a simplified version of Touretzky & Hinton's (1988) Distributed Connectionist Production System. The new system is called the Tensor Product Production System (TPPS). Copyright c 1988 by Charles Dolan & Paul Smolensky. For copies send a message to valerie at cs.ucla.edu at UCLA or kate at boulder.colorado.edu at Boulder. From VIJAYKUMAR at cs.umass.EDU Mon Oct 31 14:57:00 1988 From: VIJAYKUMAR at cs.umass.EDU (Vijaykumar Gullapalli 545-1596) Date: Mon, 31 Oct 88 15:57 EDT Subject: Tech. Report available Message-ID: <8810312059.AA10948@crash.cs.umass.edu> The following Tech. Report is available. Requests should be sent to "SMITH at cs.umass.edu". A Stochastic Algorithm for Learning Real-valued Functions via Reinforcement Feedback Vijaykumar Gullapalli COINS Technical Report 88-91 University of Massachusetts Amherst, MA 01003 ABSTRACT Reinforcement learning is the process by which the probability of the response of a system to a stimulus increases with reward and decreases with punishment. Most of the research in reinforcement learning (with the exception of the work in function optimization) has been on problems with discrete action spaces, in which the learning system chooses one of a finite number of possible actions. However, many control problems require the application of continuous control signals. In this paper, we present a stochastic reinforcement learning algorithm for learning functions with continuous outputs. Our algorithm is designed to be implemented as a unit in a connectionist network. We assume that the learning system computes its real-valued output as some function of a random activation generated using the Normal distribution. The activation at any time depends on the two parameters, the mean and the standard deviation, used in the Normal distribution, which, in turn, depend on the current inputs to the unit. Learning takes place by using our algorithm to adjust these two parameters so as to increase the probability of producing the optimal real value for each input pattern. The performance of the algorithm is studied by using it to learn tasks of varying levels of difficulty. Further, as an example of a potential application, we present a network incorporating these real-valued units that learns the inverse kinematic transform of a simulated 3 degree-of-freedom robot arm.
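The kind of unit described in this abstract is easy to caricature in code: the output is a sample from a Normal distribution whose mean depends on the input, and the parameters are nudged so that well-rewarded outputs become more likely. The Python update below is a simplified REINFORCE-style stand-in with an invented task, learning rate and annealing schedule; it is not claimed to be the rule used in the report.

    import numpy as np

    # A stochastic real-valued unit: draw the output from a Normal whose mean is
    # a weighted sum of the inputs, compare the reward of the sampled output with
    # the reward of the mean output, and move the mean weights accordingly.
    rng = np.random.default_rng(3)
    w_mean = np.zeros(3)          # input weights for the mean
    sigma = 0.5                   # exploration width (annealed crudely below)

    def act(x):
        mu = float(w_mean @ x)
        return rng.normal(mu, sigma), mu

    def reward(x, y):
        # Hypothetical task: the optimal output for input x is 2*x[0] - x[1].
        return -abs(y - (2 * x[0] - x[1]))

    for _ in range(5000):
        x = np.append(rng.uniform(-1, 1, 2), 1.0)     # two inputs plus a bias
        y, mu = act(x)
        r, r_mu = reward(x, y), reward(x, mu)         # sampled vs. baseline reward
        w_mean += 0.05 * (r - r_mu) * (y - mu) / sigma ** 2 * x
        sigma = max(0.05, sigma * 0.9995)             # shrink exploration over time

    x = np.array([0.5, -0.5, 1.0])
    print(w_mean @ x, "target:", 2 * 0.5 - (-0.5))    # learned mean vs. optimum 1.5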
From terry at cs.jhu.edu Mon Oct 31 19:35:09 1988 From: terry at cs.jhu.edu (Terry Sejnowski ) Date: Mon, 31 Oct 88 19:35:09 est Subject: reply on back-propagation fails to separate paper Message-ID: <8811010035.AA17520@crabcake.cs.jhu.edu> The proceedings of the CMU Connectionist Models Summer School have two papers on optimal choice of training set based on "critical" or "boundary" patterns: Karen Huyser on Boolean functions and Subutai Ahmad on the majority function. The proceedings are available from Morgan Kaufmann. Terry ----- From neural!jsd at ihnp4.att.com Sun Oct 30 00:08:28 1988 From: neural!jsd at ihnp4.att.com (neural!jsd@ihnp4.att.com) Date: Sun, 30 Oct 88 00:08:28 EDT Subject: We noticed LMS fails to separate Message-ID: <8810300407.AA13067@neural.UUCP> Yes, we noticed that a Least-Mean-Squares (LMS) network even with no hidden units fails to separate some problems. Ben Wittner spoke at the IEEE NIPS meeting in Denver, November 1987, describing TWO failings of this type. He gave an example of a situation in which LMS algorithms (including those commonly referred to as back-prop) are metastable, i.e. they fail to separate the data for certain initial configurations of the weights. He went on to describe another case in which the algorithm actually leaves the solution region after starting within it. He also pointed out that this can lead to learning sessions in which the categorization performance of back-prop nets (with or without hidden units) is not a monotonically improving function of learning time. Finally, he presented a couple of ways of modifying the algorithm to get around these problems, and proved a convergence theorem for the modified algorithms. One of the key ideas is something that has been mentioned in several recent postings, namely, to have zero penalty when the training pattern is well-classified or "beyond". We cited Minsky & Papert as well as Duda & Hart; we believe they were more-or-less aware of these bugs in LMS, although they never presented explicit examples of the failure modes. Here is the abstract of our paper in the proceedings, _Neural Information Processing Systems -- Natural and Synthetic_, Denver, Colorado, November 8-12, 1987, Dana Anderson Ed., AIP Press. We posted the abstract back in January '88, but apparently it didn't get through to everybody. Reprints of the whole paper are available. Strategies for Teaching Layered Networks Classification Tasks Ben S. Wittner (1) John S. Denker AT&T Bell Laboratories Holmdel, New Jersey 07733 ABSTRACT: There is a widespread misconception that the delta-rule is in some sense guaranteed to work on networks without hidden units. As previous authors have mentioned, there is no such guarantee for classification tasks. We will begin by presenting explicit counter-examples illustrating two different interesting ways in which the delta rule can fail. We go on to provide conditions which do guarantee that gradient descent will successfully train networks without hidden units to perform two-category classification tasks. We discuss the generalization of our ideas to networks with hidden units and to multi-category classification tasks. (1) Currently at NYNEX Science and Technology / 500 Westchester Ave.
White Plains, NY 10604
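The tensor product binding construction mentioned in the Dolan & Smolensky report above can be illustrated in a few lines: a role vector is bound to a filler vector with an outer product, several bindings are superposed by addition, and a filler is recovered by contracting with its role vector. The dimensions and the use of orthonormal random role vectors in the Python sketch below are choices made for clean recovery, not features of the report.

    import numpy as np

    # Bind role and filler vectors with an outer product, superpose the
    # bindings, and unbind by contracting with a role vector.
    rng = np.random.default_rng(4)

    def random_orthonormal(n, k):
        q, _ = np.linalg.qr(rng.normal(size=(n, k)))
        return q.T                      # k orthonormal role vectors of length n

    roles = random_orthonormal(16, 2)                 # roles: agent, object
    fillers = {"John": rng.normal(size=12), "ball": rng.normal(size=12)}

    # Bind and superpose:  T = agent (x) John  +  object (x) ball
    T = np.outer(roles[0], fillers["John"]) + np.outer(roles[1], fillers["ball"])

    # Unbind the "object" role by contracting T with that role vector.
    recovered = roles[1] @ T
    print(np.allclose(recovered, fillers["ball"]))    # True, since the roles are orthonormal

With distributed, non-orthogonal role vectors the recovered filler is only approximate, which is part of what makes such representations interesting for connectionist knowledge processing.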