From swain at cs.rochester.edu Tue Aug 1 12:03:00 1989 From: swain at cs.rochester.edu (swain@cs.rochester.edu) Date: Tue, 1 Aug 89 12:03:00 EDT Subject: Ph.D. thesis available Message-ID: <8908011603.AA05194@rigel.cs.rochester.edu> The following Ph.D. thesis is now available: PARALLEL OBJECT RECOGNITION FROM STRUCTURE (THE TINKERTOY PROJECT) Paul R. Cooper Department of Computer Science University of Rochester Technical Report 301 July 1989 Abstract: This thesis examines the problem of recognizing structurally composed objects. The task is the recognition of Tinkertoys --- objects whose identity is defined solely by the spatial relationships between simple parts. Ultimately, a massively parallel framework incorporating a principled treatment of uncertainty and domain dependence is developed to address the problem. The basic architecture of the solution is formed by posing structure matching as a part-wise correspondence problem in a labelling framework, then applying the unit/value principle. The solution is developed incrementally. Complexity and correctness analyses and implementation experiments are provided at each phase. In the first phase, a special-purpose network implementing discrete connectionist relaxation is used to topologically discriminate between objects. In the second step, the algorithm is generalized to a massively parallel formulation of constraint satisfaction, yielding an arc consistency algorithm with the fastest known time complexity. At this stage the formulation of the application problem is also generalized, so geometric discrimination can be achieved. Developing an implementation required defining a method for the domain-specific optimization of the parallel arc consistency algorithm. The optimization method is applicable to arbitrary domains. In the final phase, the solution is generalized to handle uncertain input information and statistical domain dependence. Segmentation and recognition are computed simultaneously by a coupled Markov Random Field. Both problems are posed as labelling problems within a unified high-level MRF architecture. In the segmentation subnet, evidence from the image is combined with clique potentials expressing both qualitative a priori constraints and learnable domain-dependent knowledge. Matching constraints and coupling constraints complete the definition of the field. The effectiveness of the framework is demonstrated in experiments involving the traditionally difficult problems of occlusion and accidental alignment. ============ TO ORDER, send requests to tr at cs.rochester.edu or physical mail to: Technical Reports Librarian, Department of Computer Science, University of Rochester, Rochester, NY 14627. The cost is $7.25. Make checks payable to the University of Rochester. From movellan at garnet.berkeley.edu Tue Aug 1 13:41:15 1989 From: movellan at garnet.berkeley.edu (movellan@garnet.berkeley.edu) Date: Tue, 1 Aug 89 10:41:15 PDT Subject: International Joint Conference on Neural Networks Message-ID: <8908011741.AA11138@garnet.berkeley.edu> Does anybody know the address for paper submissions to the next IJCNN?
From harnad at clarity.Princeton.EDU Fri Aug 4 00:48:48 1989 From: harnad at clarity.Princeton.EDU (Stevan Harnad) Date: Fri, 4 Aug 89 00:48:48 EDT Subject: Tech Report: The Symbol Grounding Problem Message-ID: <8908040448.AA08344@psycho.Princeton.EDU> THE SYMBOL GROUNDING PROBLEM Stevan Harnad Department of Psychology Princeton University ABSTRACT: There has been much discussion recently about the scope and limits of purely symbolic models of the mind and about the proper role of connectionism in cognitive modeling. This paper describes the "symbol grounding problem" for a semantically interpretable symbol system: How can its semantic interpretation be made intrinsic to the symbol system, rather than just parasitic on the meanings in our heads? How can the meanings of the meaningless symbol tokens, manipulated solely on the basis of their (arbitrary) shapes, be grounded in anything but other meaningless symbols? The problem is analogous to trying to learn Chinese from a Chinese/Chinese dictionary alone. A candidate solution is sketched: Symbolic representations must be grounded bottom-up in nonsymbolic representations of two kinds: (1) iconic representations, which are analogs of the proximal sensory projections of distal objects and events, and (2) categorical representations, which are learned and innate feature-detectors that pick out the invariant features of object and event categories from their sensory projections. Elementary symbols are the names of these object and event categories, assigned on the basis of their (nonsymbolic) categorical representations. Higher-order (3) symbolic representations, grounded in these elementary symbols, consist of symbol strings describing category membership relations ("An X is a Y that is Z"). Connectionism is one natural candidate for the mechanism that learns the invariant features underlying categorical representations, thereby connecting names to the proximal projections of the distal objects they stand for. In this way connectionism can be seen as a complementary component in a hybrid nonsymbolic/symbolic model of the mind, rather than a rival to purely symbolic modeling. Such a hybrid model would not have an autonomous symbolic "module," however; the symbolic functions would emerge as an intrinsically "dedicated" symbol system as a consequence of the bottom-up grounding of categories' names in their sensory representations. Symbol manipulation would be governed not just by the arbitrary shapes of the symbol tokens, but by the nonarbitrary shapes of the icons and category invariants in which they are grounded. Preprint Available Stevan Harnad JVNET harnad at confidence.princeton.edu harnad at princeton.edu srh at flash.bellcore.com harnad at elbereth.rutgers.edu From CLIFF%ATC%atc.bendix.com at RELAY.CS.NET Tue Aug 15 12:04:00 1989 From: CLIFF%ATC%atc.bendix.com at RELAY.CS.NET (CLIFF%ATC%atc.bendix.com@RELAY.CS.NET) Date: Tue, 15 Aug 89 11:04 EST Subject: No subject Message-ID: Is anyone aware of public domain network simulators written in C (particularly VAX or PC-based)? We would prefer to avoid a major duplication of effort. Any responses will be summarized and posted.
Thanks in advance, Pat Coleman (pat at atc.bendix.com) From hinton at ai.toronto.edu Mon Aug 14 13:23:20 1989 From: hinton at ai.toronto.edu (Geoffrey Hinton) Date: Mon, 14 Aug 89 13:23:20 EDT Subject: Toronto postdoc position filled Message-ID: <89Aug14.132403edt.10806@ephemeral.ai.toronto.edu> The job in Toronto that was advertised on the connectionist mailing list has been filled. Geoff From sankar at caip.rutgers.edu Thu Aug 10 12:37:07 1989 From: sankar at caip.rutgers.edu (ananth sankar) Date: Thu, 10 Aug 89 12:37:07 EDT Subject: neural nets/machine vision Message-ID: <8908101637.AA21037@caip.rutgers.edu> I would appreciate it if you could refer me via personal mail to any ongoing research using neural nets for machine vision tasks. Any Ph.D. theses or thesis abstracts will also be welcome. Finally, I am interested in combinations of Self Organizing and Supervised Learning Algorithms using neural nets with continuous-valued input/output. Please send mail to sankar at caip.rutgers.edu. Thanks in advance and in anticipation. Ananth Sankar From cole at cse.ogc.edu Thu Aug 17 21:04:33 1989 From: cole at cse.ogc.edu (Ron Cole) Date: Thu, 17 Aug 89 18:04:33 -0700 Subject: Try it, you'll like it Message-ID: <8908180104.AA25855@cse.ogc.edu> Tired of playing with learning and momentum constants? Tired of waiting for weeks for your network to converge? Try OPT, a simulator that uses conjugate gradient optimization to train fully connected feedforward networks with backpropagation. OPT is written in C. It is remarkably easy to use. To obtain opt: 1. ftp to cse.ogc.edu 2. login as "anonymous" with any password 3. cd to "/ogc2/guest/ftp/pub/nnvowels" 4. Look in README. OPT was written by Etienne Barnard at Carnegie-Mellon University. For more complete documentation, including figures describing conjugate gradient optimization, send mail to vincew at cse.ogc.edu. Enjoy -- Ron From olson at cs.rochester.edu Fri Aug 18 11:35:00 1989 From: olson at cs.rochester.edu (olson@cs.rochester.edu) Date: Fri, 18 Aug 89 11:35:00 EDT Subject: Ph.D. thesis available Message-ID: <8908181535.AA16631@ash.cs.rochester.edu> The following Ph.D. thesis is now available: AN ARCHITECTURAL MODEL OF VISUAL MOTION UNDERSTANDING Thomas J. Olson Department of Computer Science University of Rochester Technical Report 305 August 1989 Abstract: The past few years have seen an explosion of interest in the recovery and use of visual motion information by biological and machine vision systems. In the area of computer vision, a variety of algorithms have been developed for extracting various types of motion information from images. Neuroscientists have made great strides in understanding the flow of motion information from the retina to striate and extrastriate cortex. The psychophysics community has gone a long way toward characterizing the limits and structure of human motion processing. The central claim of this thesis is that many puzzling aspects of motion perception can be understood by assuming a particular architecture for the human motion processing system. The architecture consists of three functional units or subsystems. The first or low-level subsystem computes simple mathematical properties of the visual signal. It is entirely bottom-up, and prone to error when its implicit assumptions are violated.
The intermediate-level subsystem combines the low-level system's output with world knowledge, segmentation information and other inputs to construct a representation of the world in terms of primitive forms and their trajectories. It is claimed to be the substrate for long-range apparent motion. The highest level of the motion system assembles intermediate-level form and motion primitives into scenarios that can be used for prediction and for matching against stored models. The lowest level of the architecture is in accord with standard models of early motion perception, and details of the highest level are being worked out in related thesis work by Nigel Goddard. The secondary contribution of this thesis is a detailed connectionist model of the intermediate level of the architecture. In order to compute the trajectories of primitive shapes it is necessary to design mechanisms for handling time and Gestalt grouping effects in connectionist networks. Solutions to these problems are developed and used to construct a network that interprets continuous and apparent motion stimuli in a limited domain. Simulation results show that its interpretations are in qualitative agreement with human perception. ============ TO ORDER, send requests to tr at cs.rochester.edu or physical mail to: Technical Reports Librarian, Department of Computer Science, University of Rochester, Rochester, NY 14627. The cost is $7.25. Make checks payable to the University of Rochester. From mozer at neuron.Colorado.EDU Tue Aug 22 11:06:07 1989 From: mozer at neuron.Colorado.EDU (Michael C. Mozer) Date: Tue, 22 Aug 89 09:06:07 MDT Subject: tech report available Message-ID: <8908221506.AA03615@neuron> Please send reprint requests to "conn_tech_report at boulder.colorado.edu". On the Interaction of Selective Attention and Lexical Knowledge: A Connectionist Account of Neglect Dyslexia Mike Mozer & Marlene Behrmann Tech Report CU-CS-441-89 Neglect dyslexia, a reading impairment acquired as a consequence of brain injury, is traditionally interpreted as a disturbance of selective attention. Patients with neglect dyslexia may ignore the left side of an open book, the beginning words of a line of text, or the beginning letters of a single word. These patients provide a rich but sometimes contradictory source of data regarding the locus of attentional selectivity. We have reconsidered the patient data within the framework of an existing connectionist model of word recognition and spatial attention. We show that the effects of damage to the model resemble the reading impairments observed in neglect dyslexia. In simulation experiments, we account for a broad spectrum of behaviours including the following: (1) when two noncontiguous stimuli are presented simultaneously, the contralesional stimulus is neglected (extinction); (2) explicit instructions to the patient can reduce the severity of neglect; (3) stimulus position in the visual field affects reading performance; (4) words are read much better than pronounceable nonwords; (5) the nature of error responses depends on the morphemic composition of the stimulus; and (6) extinction interacts with lexical knowledge (if two words are presented that form a compound, e.g., COW and BOY, the patient is more likely to report both than in a control condition, e.g., SUN and FLY). 
The convergence of findings from the neuropsychological research and the computational modelling sheds light on the role of attention in normal visuospatial processing, supporting a hybrid view of attentional selection that has properties of both early and late selection. **************** PLEASE DO NOT REDISTRIBUTE TO OTHER BBOARDS ***************** From tenorio at ee.ecn.purdue.edu Tue Aug 22 14:20:21 1989 From: tenorio at ee.ecn.purdue.edu (Manoel Fernando Tenorio) Date: Tue, 22 Aug 89 13:20:21 EST Subject: reports from PURDUE Message-ID: <8908221820.AA25198@ee.ecn.purdue.edu> Bcc: -------- The SONN algorithm report is now going to be distributed free, thanks to the overwhelming response of hundreds on the net and our good administration, willing to quickly review outdated practices to accommodate the sharing spirit present in the net. The cost of such publication for any institution can be burdensome. I would advise my colleagues to practice money-saving strategies, such as local redistribution of hard copies, whenever possible. This is to avoid a future in which no one is able to pay the report price tag. I would also like to suggest setting up a few (one?) electronic standards for report formatting so that electronic sharing can be viable. Probably we should take a vote on this. Those that asked for the report, sit tight, it is coming... --ft. From INS_ATGE%JHUVMS.BITNET at VMA.CC.CMU.EDU Tue Aug 22 19:23:00 1989 From: INS_ATGE%JHUVMS.BITNET at VMA.CC.CMU.EDU (INS_ATGE%JHUVMS.BITNET@VMA.CC.CMU.EDU) Date: Tue, 22 Aug 89 18:23 EST Subject: Neural Net Archive? Message-ID: Has anyone considered (or possibly created) an archive site for trained neural networks? Why spend thousands of epochs of learning to create your network weights, only to throw them away after your research is done? If anyone feels such an archive site may be of use, please send me email (as it would be helpful to me as I lobby for a site at Hopkins). -Thomas Edwards ins_atge at jhuvms.bitnet ins_atge at jhunix.hxf.jhu.edu tedwards at cmsun.nrl.navy.mil From gabriel at icsib8.Berkeley.EDU Tue Aug 22 18:36:16 1989 From: gabriel at icsib8.Berkeley.EDU (Gabriel Cristobal) Date: Tue, 22 Aug 89 15:36:16 PDT Subject: addme request Message-ID: <8908222236.AA16669@icsib8.> Would you be so kind as to add me to the connectionist list? Thanks a lot. From munnari!batserver.cs.uq.oz.au!mav at uunet.UU.NET Wed Aug 23 10:01:50 1989 From: munnari!batserver.cs.uq.oz.au!mav at uunet.UU.NET (Simon Dennis) Date: Wed, 23 Aug 89 09:01:50 EST Subject: tech report available Message-ID: <8908222301.6551@munnari.oz.au> I am beginning some work on multilayer reinforcement learning. Could anyone currently doing work in this area, or who knows of such work, please mail me with references etc. Thanx Simon Dennis From pollack at cis.ohio-state.edu Wed Aug 23 20:19:13 1989 From: pollack at cis.ohio-state.edu (Jordan B Pollack) Date: Wed, 23 Aug 89 20:19:13 EDT Subject: Tech Reports Available by FTP Message-ID: <8908240019.AA00867@toto.cis.ohio-state.edu> **********DO NOT FORWARD TO OTHER BBOARDS************** **********DO NOT FORWARD TO OTHER BBOARDS************** **********DO NOT FORWARD TO OTHER BBOARDS************** It is really quite costly, in fact, to make 200 copies of big papers and mail them off around the world, especially since that is what journals make profit by doing! The original idea was not for self-publicity, but as an advance circulation announcement for friends and colleagues.
Here is a self-organizing solution which moves the cost of paper to the requestor and the cost of postage to the spare capacity of the network! We can start a database of technical reports in postscript (or perhaps TeX, if you're not worried about prosecurity). Local sources have donated a few tens of megabytes to the connectionist cause, in the directory pub/neuroprose on the host cheops.cis.ohio-state.edu. You can PUT or GET articles in a few seconds or minutes, tex them if necessary, and zip them off on your nearest postscript laser printer. Of course, it helps if all of the figures are in place, appended, or in a separate but closely named postscript file. For example, I am hereby announcing the FTP availability of postscript versions of a couple of articles in the publishing pipeline. Note the naming convention, which should evolve from the rather simple scheme of (author,subject,filetype): pollack.newraam.ps latest revision of RAAM paper pollack.perceptrons.ps widely advertised book review to appear in JMathPsych Here is how you can get them: ftp cheops.cis.ohio-state.edu (or, ftp 128.146.8.62) Name: anonymous Password: neuron ftp> cd pub/neuroprose ftp> ls pollack.newraam.ps pollack.perceptrons.ps ftp> get (remote-file) pollack.newraam.ps (local-file) foo.ps 261245 bytes sent in 9.9 seconds (26 Kbytes/s) ftp> get (remote-file) pollack.perceptrons.ps (local-file) bar.ps 65413 bytes sent in 3.5 seconds (18 Kbytes/s) ftp> quit unix> lpr *.ps Please put your own TR's there and announce their availability to the mailing list. It is certainly more work than just hitting "R", but probably worth it, all around. Except, perhaps, to whoever pays for network bandwidth. We could even eventually submit journal articles this way... Jordan **********DO NOT FORWARD TO OTHER BBOARDS************** **********DO NOT FORWARD TO OTHER BBOARDS************** **********DO NOT FORWARD TO OTHER BBOARDS************** From elman at amos.ucsd.edu Thu Aug 24 11:58:23 1989 From: elman at amos.ucsd.edu (Jeff Elman) Date: Thu, 24 Aug 89 08:58:23 PDT Subject: CRL TR 8903: Representation and structure in connectionist models Message-ID: <8908241558.AA00824@amos.ucsd.edu> -------------------------------------------------- CRL Tech Report 8903 August 1989 "Representation and Structure in Connectionist Models" Jeffrey L. Elman Departments of Cognitive Science and Linguistics University of California, San Diego ABSTRACT This paper focuses on the nature of representations in connectionist models. It addresses two issues: (1) Can connectionist models develop representations which possess internal structure and which provide the basis for productive and systematic behavior; and (2) Can representations which are fundamentally context-sensitive support grammatical behavior which appears to be abstract and general? Results from two simulations are reported. The simulations address problems in the distinction between type and token, the representation of lexical categories, and the representation of grammatical structure. The results suggest that connectionist representations can indeed have such internal structure and exhibit systematic behavior, and that a mechanism which is sensitive to context is capable of capturing generalizations of varying degrees of abstractness.
-------------------------------------------------- Copies may be requested by sending your name and address to yvonne at amos.ucsd.edu From tedwards at cmsun.nrl.navy.mil Mon Aug 28 14:38:31 1989 From: tedwards at cmsun.nrl.navy.mil (Thomas Edwards) Date: Mon, 28 Aug 89 14:38:31 EDT Subject: Connections Per Second Message-ID: <8908281838.AA08753@cmsun.nrl.navy.mil> I have been wondering about interconnections per second ratings. Suppose: A 3-layer backprop network with 128 input units, 64 hidden units, and 128 output units (learning the encoder problem, say). Let's say I can do 50 epochs of 128 exemplars in 41.54 seconds with this network. Is it valid to say: Number of connections per exemplar = (128 * 64) + (128 * 64) [forward prop] + (128 * 64) [back prop] = 24576 Number of connections per epoch = 24576 * 128 = 3145728 Number of connections per 50 epochs = 3145728 * 50 = 157286400 Divide by run time for 50 epochs = 157286400/41.54 = 3786384.2 Now, is it accurate to say my backprop program runs at 3.7 million interconnections per second? Is my claim still acceptable if I am actually performing the neural network calculations by systolic array matrix multiplication? I'll be the first to admit that interconnections per second is a speed measure which does not necessarily reflect the reality of the neural processing system it is measured on, but I've just been reading some Cray claims that the X-MP (1 CPU) should be able to do 50 million interconnections per second, and the Connection Machine is capable of only 13 million (which seems fairly slow compared to some of the benchmarks I have run). -Thomas Edwards ins_atge at jhunix.hcf.jhu.edu tedwards at cmsun.nrl.navy.mil ins_atge at jhuvms.BITNET From chrisley at arisia.xerox.com Mon Aug 28 17:51:36 1989 From: chrisley at arisia.xerox.com (Ron Chrisley UNTIL 10/3/88) Date: Mon, 28 Aug 89 14:51:36 PDT Subject: Connections Per Second Message-ID: <8908282151.AA25599@kanga.parc.xerox.com> Thomas Edwards wrote: "I have been wondering about interconnections per second ratings. Suppose: A 3-layer backprop network with 128 input units, 64 hidden units, and 128 output units (learning the encoder problem, say). Let's say I can do 50 epochs of 128 exemplars in 41.54 seconds with this network. Is it valid to say: Number of connections per exemplar = (128 * 64) + (128 * 64) [forward prop] + (128 * 64) [back prop] = 24576 Number of connections per epoch = 24576 * 128 = 3145728 Number of connections per 50 epochs = 3145728 * 50 = 157286400 Divide by run time for 50 epochs = 157286400/41.54 = 3786384.2 Now, is it accurate to say my backprop program runs at 3.7 million interconnections per second?" I don't know about the rest of your calculations, or how anyone else does this kind of comparison, but you should probably add in your bias weights if you have any. That should be about 192 more weights. Why did you not include both layers of weights for the back prop contribution to the total? Also, I think some people distinguish between CUPS (connection updates per second) ratings during learning, and CUPS ratings during recall. Make sure you aren't comparing apples and oranges... Ron Chrisley
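As a quick check on the arithmetic, here is the calculation from Edwards' message written out as a small C sketch. This is illustrative code only, not taken from any poster's simulator; the comments mark the two points Chrisley raises (the missing bias weights and the single backward-pass layer).

#include <stdio.h>

int main(void)
{
    int n_in = 128, n_hid = 64, n_out = 128;   /* the 128-64-128 encoder net */
    int exemplars = 128, epochs = 50;
    double seconds = 41.54;

    /* Edwards' convention: both weight layers on the forward pass, only one
       layer for the backward pass, and no bias weights. */
    long per_exemplar = (long)n_in * n_hid      /* input -> hidden, forward  */
                      + (long)n_hid * n_out     /* hidden -> output, forward */
                      + (long)n_hid * n_out;    /* backward-pass term        */
    long per_epoch = per_exemplar * exemplars;  /* 3,145,728 */
    long total = per_epoch * epochs;            /* 157,286,400 */

    printf("connections per exemplar: %ld\n", per_exemplar);  /* 24,576 */
    printf("CPS: %.1f\n", (double)total / seconds);           /* ~3,786,384 */
    return 0;
}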
From singer at Think.COM Mon Aug 28 21:13:39 1989 From: singer at Think.COM (singer@Think.COM) Date: Mon, 28 Aug 89 21:13:39 EDT Subject: Connections Per Second In-Reply-To: Thomas Edwards's message of Mon, 28 Aug 89 14:38:31 EDT <8908281838.AA08753@cmsun.nrl.navy.mil> Message-ID: <8908290113.AA04579@kulla.think.com> Date: Mon, 28 Aug 89 14:38:31 EDT From: Thomas Edwards I have been wondering about interconnections per second ratings. Suppose: [...] reading some Cray claims that the X-MP (1 CPU) should be able to do 50 million interconnections per second, and the Connection Machine is capable of only 13 million (which seems fairly slow compared to some of the benchmarks I have run). With all due respect to the issues of benchmarking neural networks and with no intention of trivializing this intellectual forum with timings, I feel obligated by my corporate loyalties to mention that the last neural network timings done on a Connection Machine were 80 million weight updates per second (paper to be presented at NIPS '89). Alexander Singer Thinking Machines Corp. singer at think.com From Scott.Fahlman at B.GP.CS.CMU.EDU Tue Aug 29 00:55:42 1989 From: Scott.Fahlman at B.GP.CS.CMU.EDU (Scott.Fahlman@B.GP.CS.CMU.EDU) Date: Tue, 29 Aug 89 00:55:42 EDT Subject: Connections Per Second In-Reply-To: Your message of Mon, 28 Aug 89 14:38:31 -0400. <8908281838.AA08753@cmsun.nrl.navy.mil> Message-ID: A 3-layer backprop network with 128 input units, 64 hidden units, and 128 output units (learning the encoder problem, say). Let's say I can do 50 epochs of 128 exemplars in 41.54 seconds with this network. Is it valid to say: Number of connections per exemplar = (128 * 64) + (128 * 64) [forward prop] + (128 * 64) [back prop] = 24576 Number of connections per epoch = 24576 * 128 = 3145728 Number of connections per 50 epochs = 3145728 * 50 = 157286400 Divide by run time for 50 epochs = 157286400/41.54 = 3786384.2 Now, is it accurate to say my backprop program runs at 3.7 million interconnections per second? Is my claim still acceptable if I am actually performing the neural network calculations by systolic array matrix multiplication? I think that most of the CPS numbers that people quote are derived by multiplying the total number of connections in the net times the number of pattern presentations divided by the total time required. You would not count a connection twice just because the backprop algorithm happens to use some or all of the weights twice. For your example, I get (64 * 129) + (128 * 65) = 16,576 connections (don't forget the bias connections), 6400 pattern presentations, and 41.54 seconds, for a grand total of 2.55 million CPS. At least, that's how I would compute these numbers. Your mileage may differ. I suspect that some of the CPS numbers quoted for commercial neural net simulators were computed using some other formula. When I talk to people selling these simulators, I always try to pin them down on how their CPS numbers were computed, but the marketing types seldom know the gory details. I think that it is reasonable to report "equivalent" CPS numbers, even if your program actually does the computation in some other way -- computing many epochs in parallel, for example. -- Scott
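For comparison, here is the same run under the counting convention Scott describes, again as an illustrative C sketch rather than anyone's actual benchmark code: every connection, bias connections included, is counted exactly once per pattern presentation, no matter how often backprop touches it.

#include <stdio.h>

int main(void)
{
    int n_in = 128, n_hid = 64, n_out = 128;
    long presentations = 50L * 128;    /* 50 epochs x 128 exemplars = 6,400 */
    double seconds = 41.54;

    /* Count each connection once, including the bias connections. */
    long connections = (long)n_hid * (n_in + 1)     /* 64 * 129 = 8,256 */
                     + (long)n_out * (n_hid + 1);   /* 128 * 65 = 8,320 */

    printf("connections: %ld\n", connections);                             /* 16,576 */
    printf("CPS: %.0f\n", (double)connections * presentations / seconds);  /* ~2,553,837 */
    return 0;
}

On this run the two conventions disagree by a factor of about 1.5, which gives some feel for the ambiguity in quoted backprop CPS figures discussed later in the thread.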
From beer at icsib.Berkeley.EDU Tue Aug 29 19:29:09 1989 From: beer at icsib.Berkeley.EDU (Joachim Beer) Date: Tue, 29 Aug 89 16:29:09 PDT Subject: Connections per Second Message-ID: <8908292329.AA11907@icsib.Berkeley.EDU> The recent discussion about CPS numbers made me wonder why we need such new metrics at all. Why not just record the execution time in seconds? Here is what I don't like about CPS numbers: * They don't facilitate the performance comparison of connectionist models and non-connectionist models (e.g. statistical pattern classifiers). Are we afraid of such comparisons? * If CPS numbers become the accepted yardstick with which to measure execution models, then one is likely to penalize new approaches which are slower per connection but, overall, require fewer connections. * CPS numbers are very hard to interpret, as we have seen in the recent discussion about CPS. Every performance metric is open to abuse, but measuring performance in absolute execution time seems to me the least ambiguous metric. After all, the user is not interested in how many connections are updated per second but how long it will take to solve a (benchmark) problem. A similar development has taken place in the field of Logic Programming, where everything is measured in KLIPS (kilo logical inferences per second). This has led to such confusion that it is now virtually impossible to fairly compare competing execution models (not to mention comparison with other programming paradigms). Let's hope this will not happen to connectionism. -Joachim Beer From Scott.Fahlman at B.GP.CS.CMU.EDU Tue Aug 29 20:43:21 1989 From: Scott.Fahlman at B.GP.CS.CMU.EDU (Scott.Fahlman@B.GP.CS.CMU.EDU) Date: Tue, 29 Aug 89 20:43:21 EDT Subject: Connections per Second In-Reply-To: Your message of Tue, 29 Aug 89 16:29:09 -0700. <8908292329.AA11907@icsib.Berkeley.EDU> Message-ID: Why not just record the execution time in seconds? Because implementations that differ by several orders of magnitude in raw speed cannot be compared using the identical network. For some simulators it would take forever to measure the runtime of any net with more than 100 connections; for other implementations, that's not even enough to fill up the pipeline. As long as the time required by a given implementation grows more or less linearly with the number of connections in the network, CPS tells you something useful about that implementation: how long it will take to run some number of passes on a given network. And if two systems implement the same algorithm (e.g. standard backprop), then the CPS numbers give you a rough way of comparing them: one Warp equals three Commodore-64's, or whatever. * They don't facilitate the performance comparison of connectionist models and non-connectionist models (e.g. statistical pattern classifiers). Are we afraid of such comparisons? CPS numbers are clearly not useful for comparing two different neural net algorithms, unless they are very close to one another or unless there is some formal transformation from one algorithm into another. A Boltzmann CPS is very different from a backprop CPS or a Hopfield CPS. So of course there's no way to compare backprop to some statistical classifier using CPS numbers -- they are only good for comparing members of the same family. You're right, when it comes to comparing the speed of a given connectionist model against a statistical model (or two very different connectionist models), about the only way we can do it is to compare total runtime, on the same machine, programmed by equally competent hackers, and with the same degree of optimization for speed vs. flexibility. If any of these conditions doesn't hold, you can be off by a large factor, but still you might be able to use the result for a crude order-of-magnitude comparison of runtimes.
* If CPS numbers become the accepted yardstick with which to measure execution models, then one is likely to penalize new approaches which are slower per connection but, overall, require fewer connections. Usually the tradeoff is between the speed of a single training cycle and the number of cycles needed, not the size of the net. But in any case, I think there is no danger as long as we all realize that CPS is useless in comparing two significantly different algorithms. * CPS numbers are very hard to interpret, as we have seen in the recent discussion about CPS. That's why we're having this discussion. Maybe we can agree on a common metric that is useful, at least in a very limited set of circumstances. I suspect that the CPS numbers currently being tossed around for backprop are ambiguous by a factor of two, and it would be nice to sort that out. -- Scott From chrisley at arisia.xerox.com Tue Aug 29 22:05:18 1989 From: chrisley at arisia.xerox.com (Ron Chrisley UNTIL 10/3/88) Date: Tue, 29 Aug 89 19:05:18 PDT Subject: Connections per Second Message-ID: <8908300205.AA04929@kanga.parc.xerox.com> Joachim Beer writes: "The recent discussion about CPS numbers made me wonder why we need such new metrics at all. Why not just record the execution time in seconds?... After all, the user is not interested in how many connections are updated per second but how long it will take to solve a (benchmark) problem." The CUPS comparison should be between *implementations* of standard connectionist networks, not between different models themselves. I don't know of anyone comparing ART to backprop using CUPS, but I do see people comparing various network systems (like SAIC, HNC, the Rochester connectionist simulator on machine XYZ, etc.) that implement standard models. This measurement, if the calculation procedure is standardized, is useful. You're right that the user wants to know how long it will take, in seconds, for the network to solve a problem, but they usually want to know how long it will take for the network to solve *their* problem, not a benchmark one. This is where CUPS ratings come in handy. If we *banned* [:-)] CUPS ratings, people would use them themselves anyway: "Let's see, if HNC told me that it takes 42.21 seconds for them to do 150 iterations of XOR learning with network configuration Z, then that means that they can do N CUPS, so that means that they could do 100 iterations of my problem with network configuration X in under 76 seconds." Of course, this says nothing about how long (in iterations) it takes to *solve* a particular problem. But reporting absolute times for benchmarks doesn't either (at least not until we have a good theory of how different problems relate to each other in terms of learning difficulty. I should live so long... ?-) Ron Chrisley From hinton at ai.toronto.edu Wed Aug 30 10:34:38 1989 From: hinton at ai.toronto.edu (Geoffrey Hinton) Date: Wed, 30 Aug 89 10:34:38 EDT Subject: Connections Per Second In-Reply-To: Your message of Tue, 29 Aug 89 09:18:00 -0400. Message-ID: <89Aug30.103535edt.11258@ephemeral.ai.toronto.edu> I agree with previous messages that the problem with reporting "connections per second" is that it could mean many different things: 1. CPS for just a forward pass (2 FLOPS per connection) 2. CPS for forward & backward passes (usually 6 FLOPS per connection) 3. CPS for forward & backward passes on connections from input units (4 FLOPS) 4. CPS for forward & backward passes with online weight updates (8 or 6 FLOPS). The obvious solution is to just report the megaflops, where the only operations that are counted are the floating point operations in the inner loops (i.e. we ignore any computations used to implement the sigmoid etc, but we include the weight updating if we are using the online method, so we use the numbers given above). I expect that unscrupulous salespersons will still try to cheat by counting operations like loads and stores and increments to loop counters in the megaflop rating, but at least the researchers will have a standard figure that can be used to compare machines and simulators. Geoff
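To make the megaflop suggestion concrete, here is an illustrative C sketch (not Geoff's code) that turns the 128-64-128 timing from earlier in the thread into a megaflop figure using the per-connection FLOP counts listed above. How those counts split between input-unit connections and the rest, and the decision to ignore bias connections, are assumptions of this sketch, not part of the post.

#include <stdio.h>

int main(void)
{
    long in_hid  = 128L * 64;          /* connections from input units           */
    long hid_out = 64L * 128;          /* remaining hidden-to-output connections */
    long presentations = 50L * 128;    /* 6,400 pattern presentations            */
    double seconds = 41.54;
    int online = 0;                    /* set to 1 for online weight updates     */

    /* One reading of the counts above: 4 FLOPS per input-unit connection,
       6 per other connection, plus 2 more each when weights are updated online. */
    double flops = in_hid  * (4.0 + (online ? 2.0 : 0.0))
                 + hid_out * (6.0 + (online ? 2.0 : 0.0));
    double mflops = flops * presentations / seconds / 1e6;

    printf("%.1f MFLOPS\n", mflops);   /* ~12.6 without online weight updates */
    return 0;
}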
From beer%icsia2.Berkeley.EDU at berkeley.edu Wed Aug 30 13:26:20 1989 From: beer%icsia2.Berkeley.EDU at berkeley.edu (Joachim Beer) Date: Wed, 30 Aug 89 10:26:20 PDT Subject: Connections per Second In-Reply-To: Your message of Tue, 29 Aug 89 20:43:21 -0400. <8908300044.AA12383@icsib.Berkeley.EDU> Message-ID: <8908301726.AA00707@icsia2> I still believe execution time is a more versatile metric than CUPS. Of course, it all depends on what you want to benchmark. You might want to compare different algorithms or benchmark connectionist simulators/machines. It is clear that CUPS don't help you to compare different connectionist models. However, most simulators try to support a wide range of different connectionist models. Knowing their CUPS performance for standard backprop does not help me to assess their potential performance for different connectionist models. I would therefore like to see a connectionist benchmark set that incorporates as many different models as possible. I realize that this is probably asking too much, but, for example, even backprop can be implemented in many alternative ways: on-line, off-line, conjugate gradient methods, etc. For example, in order to evaluate a backprop simulator/machine I would like to measure the simulator on the above alternative implementation models, because they have slightly different computational requirements. Just standard (on-line?) backprop CUPS performance is not very informative for anything that aims to be more than just a standard backprop machine or simulator. To make it short, CUPS apply only to one precisely defined benchmark program (usually standard backprop). In my opinion this is too narrow a definition of a benchmark metric. Can one define an (artificial?) benchmark set that reflects as well as possible the operational and computational requirements of connectionist networks in general? Something the connectionist community could agree on. -Joachim From huyser at mithril.stanford.edu Wed Aug 30 13:43:53 1989 From: huyser at mithril.stanford.edu (Karen Huyser) Date: Wed, 30 Aug 89 10:43:53 PDT Subject: Connections per second Message-ID: <8908301743.AA15029@mithril.Stanford.EDU> Geoffrey Hinton writes: "The obvious solution is just to report the megaflops . . ." I disagree. To me, the obvious solution is to work to standardize an appropriate definition of CPS for each learning algorithm and to state the definition used in the calculation whenever the CPS measure is quoted. Karen Huyser From delta at csl.ncsu.edu Thu Aug 31 10:49:38 1989 From: delta at csl.ncsu.edu (Thomas Hildebrandt) Date: Thu, 31 Aug 89 10:49:38 EDT Subject: Desire Fukushima Network Simulator Message-ID: <8908311449.AA21098@csl> Dear All: I would like to investigate the work of Kunihiko Fukushima more fully by duplicating his experiments as closely as possible.
If anyone knows of the existence of a working simulator for his 'neocognitron' model, I would appreciate receiving information about it. Naturally, source code would be of inestimable value. Thanks. Thomas H. Hildebrandt (delta at csl36h.ncsu.edu)
Dphys4neuroz at phys4.anu;act at eeadfa.eerayMail,1217,620732672,600z701phys4.anucluster.cs.su.ozmailerID=send+Yzu1z0FH7,ccadfa=01 at 071 From swain at cs.rochester.edu Tue Aug 1 12:03:00 1989 From: swain at cs.rochester.edu (swain@cs.rochester.edu) Date: Tue, 1 Aug 89 12:03:00 EDT Subject: Ph.D. thesis available Message-ID: <8908011603.AA05194@rigel.cs.rochester.edu> The following Ph.D. thesis now available: PARALLEL OBJECT RECOGNITION FROM STRUCTURE (THE TINKERTOY PROJECT) Paul R. Cooper Department of Computer Science University of Rochester Technical Report 301 July 1989 Abstract: This thesis examines the problem of recognizing structurally composed objects. The task is the recognition of Tinkertoys --- objects whose identity is defined solely by the spatial relationships between simple parts. Ultimately, a massively parallel framework incorporating a principled treatment of uncertainty and domain dependence is developed to address the problem. The basic architecture of the solution is formed by posing structure matching as a part-wise correspondence problem in a labelling framework, then applying the unit/value principle. The solution is developed incrementally. Complexity and correctness analyses and implementation experiments are provided at each phase. In the first phase, a special purpose network implementing discrete connectionist relaxation is used to topologically discriminate between objects. In the second step, the algorithm is generalized to a massively parallel formulation of constraint satisfaction, yielding an arc consistency algorithm with the fastest known time complexity. At this stage the formulation of the application problem is also generalized, so geometric discrimination can be achieved. Developing an implementation required defining a method for the domain specific optimization of the parallel arc consistency algorithm. The optimization method is applicable to arbitrary domains. In the final phase, the solution is generalized to handle uncertain input information and statistical domain dependence. Segmentation and recognition are computed simultaneously by a coupled Markov Random Field. Both problems are posed as labelling problems within a unified high-level MRF architecture. In the segmentation subnet, evidence from the image is combined with clique potentials expressing both qualitative {\em a priori} constraints and learnable domain dependent knowledge. Matching constraints and coupling constraints complete the definition of the field. The effectiveness of the framework is demonstrated in experiments involving the traditionally difficult problems of occlusion and accidental alignment. ============ TO ORDER, send requests to tr at cs.rochester.edu or physical mail to: Technical Reports Librarian, Department of Computer Science, University of Rochester, Rochester, NY 14627. The cost is $7.25. Make checks payable to the University of Rochester. From movellan at garnet.berkeley.edu Tue Aug 1 13:41:15 1989 From: movellan at garnet.berkeley.edu (movellan@garnet.berkeley.edu) Date: Tue, 1 Aug 89 10:41:15 PDT Subject: International Joint Conference on Neural Networks Message-ID: <8908011741.AA11138@garnet.berkeley.edu> Does anybody know the address for paper submissions to the next IJCNN ? 
From harnad at clarity.Princeton.EDU Fri Aug 4 00:48:48 1989 From: harnad at clarity.Princeton.EDU (Stevan Harnad) Date: Fri, 4 Aug 89 00:48:48 EDT Subject: Tech Report: THe Symbol Grounding Problem Message-ID: <8908040448.AA08344@psycho.Princeton.EDU> THE SYMBOL GROUNDING PROBLEM Stevan Harnad Department of Psychology Princeton University ABSTRACT: There has been much discussion recently about the scope and limits of purely symbolic models of the mind and about the proper role of connectionism in cognitive modeling. This paper describes the "symbol grounding problem" for a semantically interpretable symbol system: How can its semantic interpretation be made intrinsic to the symbol system, rather than just parasitic on the meanings in our heads? How can the meanings of the meaningless symbol tokens, manipulated solely on the basis of their (arbitrary) shapes, be grounded in anything but other meaningless symbols? The problem is analogous to trying to learn Chinese from a Chinese/Chinese dictionary alone. A candidate solution is sketched: Symbolic representations must be grounded bottom-up in nonsymbolic representations of two kinds: (1) iconic representations, which are analogs of the proximal sensory projections of distal objects and events, and (2) categorical representations, which are learned and innate feature-detectors that pick out the invariant features of object and event categories from their sensory projections. Elementary symbols are the names of these object and event categories, assigned on the basis of their (nonsymbolic) categorical representations. Higher-order (3) symbolic representations, grounded in these elementary symbols, consist of symbol strings describing category membership relations ("An X is a Y that is Z"). Connectionism is one natural candidate for the mechanism that learns the invariant features underlying categorical representations, thereby connecting names to the proximal projections of the distal objects they stand for. In this way connectionism can be seen as a complementary component in a hybrid nonsymbolic/symbolic model of the mind, rather than a rival to purely symbolic modeling. Such a hybrid model would not have an autonomous symbolic "module," however; the symbolic functions would emerge as an intrinsically "dedicated" symbol system as a consequence of the bottom-up grounding of categories' names in their sensory representations. Symbol manipulation would be governed not just by the arbitrary shapes of the symbol tokens, but by the nonarbitrary shapes of the icons and category invariants in which they are grounded. Preprint Available Stevan Harnad JVNET harnad at confidence.princeton.edu harnad at princeton.edu srh at flash.bellcore.com harnad at elbereth.rutgers.edu From CLIFF%ATC%atc.bendix.com at RELAY.CS.NET Tue Aug 15 12:04:00 1989 From: CLIFF%ATC%atc.bendix.com at RELAY.CS.NET (CLIFF%ATC%atc.bendix.com@RELAY.CS.NET) Date: Tue, 15 Aug 89 11:04 EST Subject: No subject Message-ID: Is anyone aware of public domain network simulators written in C (particularly VAX or PC-based)? We would prefer to avoid a major duplication of effort. Any responses will be summarized and posted. 
Thanks in advance, Pat Coleman (pat at atc.bendix.com) From hinton at ai.toronto.edu Mon Aug 14 13:23:20 1989 From: hinton at ai.toronto.edu (Geoffrey Hinton) Date: Mon, 14 Aug 89 13:23:20 EDT Subject: Toronto postdoc position filled Message-ID: <89Aug14.132403edt.10806@ephemeral.ai.toronto.edu> The job in Toronto that was advertised on the connectionist mailing list has been filled. Geoff From sankar at caip.rutgers.edu Thu Aug 10 12:37:07 1989 From: sankar at caip.rutgers.edu (ananth sankar) Date: Thu, 10 Aug 89 12:37:07 EDT Subject: neural nets/machine vision Message-ID: <8908101637.AA21037@caip.rutgers.edu> I would appreciate it if you could refer me via personal mail to any ongoing research using neural nets for machine vision tasks. Any Ph.D. thesis/ abstracts of thesis will also be in order. Finally I am interested in combinations of Self Organizing and Supervised Learning Algorithms using neural nets with continuous valued input/output. Please send mail to sankar at caip.rutgers.edu Thanks in advance and in anticipation.. Ananth Sankar From cole at cse.ogc.edu Thu Aug 17 21:04:33 1989 From: cole at cse.ogc.edu (Ron Cole) Date: Thu, 17 Aug 89 18:04:33 -0700 Subject: Try it, you'll like it Message-ID: <8908180104.AA25855@cse.ogc.edu> Tired of playing with learning and momentum constants? Tired of waiting for weeks for your network to converge? Try OPT, a simulator that uses conjugate gradient optimization to train fully connected feedforward networks with backpropagation. OPT is written in C. It is remarkably easy to use. To obtain opt: 1. ftp to cse.ogc.edu 2. login as "anonymous" with any password 3. cd to "/ogc2/guest/ftp/pub/nnvowels" 4. Look in README. OPT was written by Etienne Barnard at Carnegie-Mellon University. For more complete documentation, including figures describing conjugate gradient optimization, send mail to vincew at cse.ogc.edu. Enjoy -- Ron From olson at cs.rochester.edu Fri Aug 18 11:35:00 1989 From: olson at cs.rochester.edu (olson@cs.rochester.edu) Date: Fri, 18 Aug 89 11:35:00 EDT Subject: Ph.D. thesis available Message-ID: <8908181535.AA16631@ash.cs.rochester.edu> The following Ph.D. thesis now available: AN ARCHITECTURAL MODEL OF VISUAL MOTION UNDERSTANDING Thomas J. Olson Department of Computer Science University of Rochester Technical Report 305 August 1989 Abstract: The past few years have seen an explosion of interest in the recovery and use of visual motion information by biological and machine vision systems. In the area of computer vision, a variety of algorithms have been developed for extracting various types of motion information from images. Neuroscientists have made great strides in understanding the flow of motion information from the retina to striate and extrastriate cortex. The psychophysics community has gone a long way toward characterizing the limits and structure of human motion processing. The central claim of this thesis is that many puzzling aspects of motion perception can be understood by assuming a particular architecture for the human motion processing system. The architecture consists of three functional units or subsystems. The first or low-level subsystem computes simple mathematical properties of the visual signal. It is entirely bottom-up, and prone to error when its implicit assumptions are violated. 
The intermediate-level subsystem combines the low-level system's output with world knowledge, segmentation information and other inputs to construct a representation of the world in terms of primitive forms and their trajectories. It is claimed to be the substrate for long-range apparent motion. The highest level of the motion system assembles intermediate-level form and motion primitives into scenarios that can be used for prediction and for matching against stored models. The lowest level of the architecture is in accord with standard models of early motion perception, and details of the highest level are being worked out in related thesis work by Nigel Goddard. The secondary contribution of this thesis is a detailed connectionist model of the intermediate level of the architecture. In order to compute the trajectories of primitive shapes it is necessary to design mechanisms for handling time and Gestalt grouping effects in connectionist networks. Solutions to these problems are developed and used to construct a network that interprets continuous and apparent motion stimuli in a limited domain. Simulation results show that its interpretations are in qualitative agreement with human perception. ============ TO ORDER, send requests to tr at cs.rochester.edu or physical mail to: Technical Reports Librarian, Department of Computer Science, University of Rochester, Rochester, NY 14627. The cost is $7.25. Make checks payable to the University of Rochester. From mozer at neuron.Colorado.EDU Tue Aug 22 11:06:07 1989 From: mozer at neuron.Colorado.EDU (Michael C. Mozer) Date: Tue, 22 Aug 89 09:06:07 MDT Subject: tech report available Message-ID: <8908221506.AA03615@neuron> Please send reprint requests to "conn_tech_report at boulder.colorado.edu". On the Interaction of Selective Attention and Lexical Knowledge: A Connectionist Account of Neglect Dyslexia Mike Mozer & Marlene Behrmann Tech Report CU-CS-441-89 Neglect dyslexia, a reading impairment acquired as a consequence of brain injury, is traditionally interpreted as a disturbance of selective attention. Patients with neglect dyslexia may ignore the left side of an open book, the beginning words of a line of text, or the beginning letters of a single word. These patients provide a rich but sometimes contradictory source of data regarding the locus of attentional selectivity. We have reconsidered the patient data within the framework of an existing connectionist model of word recognition and spatial attention. We show that the effects of damage to the model resemble the reading impairments observed in neglect dyslexia. In simulation experiments, we account for a broad spectrum of behaviours including the following: (1) when two noncontiguous stimuli are presented simultaneously, the contralesional stimulus is neglected (extinction); (2) explicit instructions to the patient can reduce the severity of neglect; (3) stimulus position in the visual field affects reading performance; (4) words are read much better than pronounceable nonwords; (5) the nature of error responses depends on the morphemic composition of the stimulus; and (6) extinction interacts with lexical knowledge (if two words are presented that form a compound, e.g., COW and BOY, the patient is more likely to report both than in a control condition, e.g., SUN and FLY). 
The convergence of findings from the neuropsychological research and the computational modelling sheds light on the role of attention in normal visuospatial processing, supporting a hybrid view of attentional selection that has properties of both early and late selection. **************** PLEASE DO NOT REDISTRIBUTE TO OTHER BBOARDS ***************** From tenorio at ee.ecn.purdue.edu Tue Aug 22 14:20:21 1989 From: tenorio at ee.ecn.purdue.edu (Manoel Fernando Tenorio) Date: Tue, 22 Aug 89 13:20:21 EST Subject: reports from PURDUE Message-ID: <8908221820.AA25198@ee.ecn.purdue.edu> Bcc: -------- The SONN algorithm report is now going to be distributed free, thanks to the overwhelming response of hundreds on the net and our good administration, willing to quickly review outdated practices to accomodate the sharing spirit present in the net. The cost of such publication for any institutions can be burdensome. I would advise my colleagues to whenever possible practice money saving strategies such as local redistribution of hard copies. This is to avoid in the future no one being able to pay the report price tag. I also would like to suggest the setting up of a few (one?) electronic standard for report formatting so that electronic sharing can be viable. Problably we should take a vote on this. Those that asked for the report, sit tight, it is coming... --ft. From INS_ATGE%JHUVMS.BITNET at VMA.CC.CMU.EDU Tue Aug 22 19:23:00 1989 From: INS_ATGE%JHUVMS.BITNET at VMA.CC.CMU.EDU (INS_ATGE%JHUVMS.BITNET@VMA.CC.CMU.EDU) Date: Tue, 22 Aug 89 18:23 EST Subject: Neural Net Archive? Message-ID: Has anyone considered (or possibly created) an archive site for trained neural networks? Why spend thousands of epochs of learning to create you network weights, only to throw them away after your research is done? If anyone feels such an archive site may be of use, please send me email (as it would be helpful to me as I lobby for a site at Hopkins). -Thomas Edwards ins_atge at jhuvms.bitnet ins_atge at jhunix.hxf.jhu.edu tedwards at cmsun.nrl.navy.mil From gabriel at icsib8.Berkeley.EDU Tue Aug 22 18:36:16 1989 From: gabriel at icsib8.Berkeley.EDU (Gabriel Cristobal) Date: Tue, 22 Aug 89 15:36:16 PDT Subject: addme request Message-ID: <8908222236.AA16669@icsib8.> Would you be so kind to add me in the connectionist list. Thanks a lot. From munnari!batserver.cs.uq.oz.au!mav at uunet.UU.NET Wed Aug 23 10:01:50 1989 From: munnari!batserver.cs.uq.oz.au!mav at uunet.UU.NET (Simon Dennis) Date: Wed, 23 Aug 89 09:01:50 EST Subject: tech report available Message-ID: <8908222301.6551@munnari.oz.au> I beginning some work on multilayer reinforcement learning. Could anyone currently doing work or who knows of work in this area please mail me with references etc. Thanx Simon Dennis From pollack at cis.ohio-state.edu Wed Aug 23 20:19:13 1989 From: pollack at cis.ohio-state.edu (Jordan B Pollack) Date: Wed, 23 Aug 89 20:19:13 EDT Subject: Tech Reports Available by FTP Message-ID: <8908240019.AA00867@toto.cis.ohio-state.edu> **********DO NOT FORWARD TO OTHER BBOARDS************** **********DO NOT FORWARD TO OTHER BBOARDS************** **********DO NOT FORWARD TO OTHER BBOARDS************** It is really quite costly, in fact, to make 200 copies of big papers and mail them off around the world, especially since that is what journals make profit by doing! The original idea was not for self-publicity, but as an advanced circulation announcement for friends and colleagues. 
Here is a self-organizing solution which moves the cost of paper to the requestor and the cost of postage to the spare capacity of the network! We can start a database of technical reports in postscript (or perhaps TeX, if you're not worried about security). Local sources have donated a few tens of megabytes to the connectionist cause, in the directory pub/neuroprose on the host cheops.cis.ohio-state.edu. You can PUT or GET articles in a few seconds or minutes, tex them if necessary, and zip them off on your nearest postscript laser printer. Of course, it helps if all of the figures are in place, appended, or in a separate but closely named postscript file. For example, I am hereby announcing the FTP availability of postscript versions of a couple of articles in the publishing pipeline. Note the naming convention, which should evolve from the rather simple scheme of (author,subject,filetype):
pollack.newraam.ps      latest revision of RAAM paper
pollack.perceptrons.ps  widely advertised book review to appear in JMathPsych
Here is how you can get them:
ftp cheops.cis.ohio-state.edu (or, ftp 128.146.8.62)
Name: anonymous
Password: neuron
ftp> cd pub/neuroprose
ftp> ls
pollack.newraam.ps
pollack.perceptrons.ps
ftp> get
(remote-file) pollack.newraam.ps
(local-file) foo.ps
261245 bytes sent in 9.9 seconds (26 Kbytes/s)
ftp> get
(remote-file) pollack.perceptrons.ps
(local-file) bar.ps
65413 bytes sent in 3.5 seconds (18 Kbytes/s)
ftp> quit
unix> lpr *.ps
Please put your own TR's there and announce their availability to the mailing list. It is certainly more work than just hitting "R", but probably worth it, all around. Except, perhaps, to whoever pays for network bandwidth. We could even eventually submit journal articles this way... Jordan **********DO NOT FORWARD TO OTHER BBOARDS************** **********DO NOT FORWARD TO OTHER BBOARDS************** **********DO NOT FORWARD TO OTHER BBOARDS************** From elman at amos.ucsd.edu Thu Aug 24 11:58:23 1989 From: elman at amos.ucsd.edu (Jeff Elman) Date: Thu, 24 Aug 89 08:58:23 PDT Subject: CRL TR 8903: Representation and structure in connectionist models Message-ID: <8908241558.AA00824@amos.ucsd.edu> -------------------------------------------------- CRL Tech Report 8903 August 1989 "Representation and Structure in Connectionist Models" Jeffrey L. Elman Departments of Cognitive Science and Linguistics University of California, San Diego ABSTRACT This paper focuses on the nature of representations in connectionist models. It addresses two issues: (1) Can connectionist models develop representations which possess internal structure and which provide the basis for productive and systematic behavior; and (2) Can representations which are fundamentally context-sensitive support grammatical behavior which appears to be abstract and general? Results from two simulations are reported. The simulations address problems in the distinction between type and token, the representation of lexical categories, and the representation of grammatical structure. The results suggest that connectionist representations can indeed have such internal structure and exhibit systematic behavior, and that a mechanism which is sensitive to context is capable of capturing generalizations of varying degrees of abstractness.
-------------------------------------------------- Copies may be requested by sending your name and address to yvonne at amos.ucsd.edu From tedwards at cmsun.nrl.navy.mil Mon Aug 28 14:38:31 1989 From: tedwards at cmsun.nrl.navy.mil (Thomas Edwards) Date: Mon, 28 Aug 89 14:38:31 EDT Subject: Connections Per Second Message-ID: <8908281838.AA08753@cmsun.nrl.navy.mil> I have been wondering about interconnections per second ratings. Suppose: a 3-layer backprop network with 128 input units, 64 hidden units, and 128 output units (learning the encoder problem, say). Let's say I can do 50 epochs of 128 exemplars in 41.54 seconds for this network. Is it valid to say: Number of connections per exemplar = (128 * 64) + (128 * 64) [forward prop] + (128 * 64) [back prop] = 24576 Number of connections per epoch = 24576 * 128 = 3145728 Number of connections per 50 epochs = 3145728 * 50 = 157286400 Divide by run time for 50 epochs = 157286400/41.54 = 3786384.2 Now, is it accurate to say my backprop program runs at 3.7 million interconnections per second? Is my claim still acceptable if I am actually performing the neural network calculations by systolic array matrix multiplication? I'll be the first to admit that interconnections per second is a speed measure which does not necessarily reflect the reality of the neural processing system it is measured on, but I've just been reading some Cray claims that the X-MP (1 CPU) should be able to do 50 million interconnections per second, and the Connection Machine is capable of only 13 million (which seems fairly slow compared to some of the benchmarks I have run). -Thomas Edwards ins_atge at jhunix.hcf.jhu.edu tedwards at cmsun.nrl.navy.mil ins_atge at jhuvms.BITNET From chrisley at arisia.xerox.com Mon Aug 28 17:51:36 1989 From: chrisley at arisia.xerox.com (Ron Chrisley UNTIL 10/3/88) Date: Mon, 28 Aug 89 14:51:36 PDT Subject: Connections Per Second Message-ID: <8908282151.AA25599@kanga.parc.xerox.com> Thomas Edwards wrote: "I have been wondering about interconnections per second ratings. Suppose: a 3-layer backprop network with 128 input units, 64 hidden units, and 128 output units (learning the encoder problem, say). Let's say I can do 50 epochs of 128 exemplars in 41.54 seconds for this network. Is it valid to say: Number of connections per exemplar = (128 * 64) + (128 * 64) [forward prop] + (128 * 64) [back prop] = 24576 Number of connections per epoch = 24576 * 128 = 3145728 Number of connections per 50 epochs = 3145728 * 50 = 157286400 Divide by run time for 50 epochs = 157286400/41.54 = 3786384.2 Now, is it accurate to say my backprop program runs at 3.7 million interconnections per second?" I don't know about the rest of your calculations, or how anyone else does this kind of comparison, but you should probably add in your bias weights if you have any. That should be about 192 more weights. Why did you not include both layers of weights for the back prop contribution to the total? Also, I think some people distinguish between CUPS (connection updates per second) ratings during learning, and CUPS ratings during recall. Make sure you aren't comparing apples and oranges...
Ron Chrisley From singer at Think.COM Mon Aug 28 21:13:39 1989 From: singer at Think.COM (singer@Think.COM) Date: Mon, 28 Aug 89 21:13:39 EDT Subject: Connections Per Second In-Reply-To: Thomas Edwards's message of Mon, 28 Aug 89 14:38:31 EDT <8908281838.AA08753@cmsun.nrl.navy.mil> Message-ID: <8908290113.AA04579@kulla.think.com> Date: Mon, 28 Aug 89 14:38:31 EDT From: Thomas Edwards I have been wondering about interconnections per second ratings. Suppose: [...] reading some Cray claims that the X-MP (1 CPU) should be able to do 50 million interconnections per second, and the Connection Machine is capable of only 13 million (which seems fairly slow compared to some of the benchmarks I have run). With all due respect to the issues of benchmarking neural networks and with no intention of trivializing this intellectual forum with timings, I feel obligated by my corporate loyalties to mention that the last neural network timings done on a Connection Machine were 80 million weight updates per second (paper to be presented at NIPS '89). Alexander Singer Thinking Machines Corp. singer at think.com From Scott.Fahlman at B.GP.CS.CMU.EDU Tue Aug 29 00:55:42 1989 From: Scott.Fahlman at B.GP.CS.CMU.EDU (Scott.Fahlman@B.GP.CS.CMU.EDU) Date: Tue, 29 Aug 89 00:55:42 EDT Subject: Connections Per Second In-Reply-To: Your message of Mon, 28 Aug 89 14:38:31 -0400. <8908281838.AA08753@cmsun.nrl.navy.mil> Message-ID: A 3-layer backprop network with 128 input units, 64 hidden units, and 128 output units (learning the encoder problem, say). Let's say I can do 50 epochs of 128 exemplars in 41.54 seconds for this network. Is it valid to say: Number of connections per exemplar = (128 * 64) + (128 * 64) [forward prop] + (128 * 64) [back prop] = 24576 Number of connections per epoch = 24576 * 128 = 3145728 Number of connections per 50 epochs = 3145728 * 50 = 157286400 Divide by run time for 50 epochs = 157286400/41.54 = 3786384.2 Now, is it accurate to say my backprop program runs at 3.7 million interconnections per second? Is my claim still acceptable if I am actually performing the neural network calculations by systolic array matrix multiplication? I think that most of the CPS numbers that people quote are derived by multiplying the total number of connections in the net times the number of pattern presentations, divided by the total time required. You would not count a connection twice just because the backprop algorithm happens to use some or all of the weights twice. For your example, I get (64 * 129) + (128 * 65) = 16,576 connections (don't forget the bias connections), 6400 pattern presentations, and 41.54 seconds, for a grand total of 2.55 million CPS. At least, that's how I would compute these numbers. Your mileage may differ. I suspect that some of the CPS numbers quoted for commercial neural net simulators were computed using some other formula. When I talk to people selling these simulators, I always try to pin them down on how their CPS numbers were computed, but the marketing types seldom know the gory details. I think that it is reasonable to report "equivalent" CPS numbers, even if your program actually does the computation in some other way -- computing many epochs in parallel, for example. -- Scott
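As a worked example of the two counting conventions just discussed, here is a minimal Python sketch, added purely for exposition and not taken from any of the messages; the function names are invented, and the numbers are simply the ones quoted in the thread (a 128-64-128 network, 128 patterns, 50 epochs, 41.54 seconds). It reproduces both the roughly 3.79 million CPS figure from the original question and Fahlman's roughly 2.55 million CPS.

# Sketch: connections-per-second under the two counting conventions above.
def cps_edwards(n_in, n_hid, n_out, patterns, epochs, seconds):
    # Counts both weight layers on the forward pass plus one layer again
    # for the backward pass, and ignores bias connections (as in the question).
    per_pattern = (n_in * n_hid) + (n_hid * n_out) + (n_hid * n_out)
    return per_pattern * patterns * epochs / seconds

def cps_fahlman(n_in, n_hid, n_out, patterns, epochs, seconds):
    # Counts each connection once per pattern presentation,
    # including the bias connection feeding every hidden and output unit.
    connections = (n_in + 1) * n_hid + (n_hid + 1) * n_out
    return connections * patterns * epochs / seconds

args = (128, 64, 128, 128, 50, 41.54)
print(cps_edwards(*args))   # about 3.79 million CPS
print(cps_fahlman(*args))   # about 2.55 million CPS

The gap between the two figures comes entirely from which connections are counted per presentation, which is exactly the ambiguity the rest of the thread tries to pin down.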
From beer at icsib.Berkeley.EDU Tue Aug 29 19:29:09 1989 From: beer at icsib.Berkeley.EDU (Joachim Beer) Date: Tue, 29 Aug 89 16:29:09 PDT Subject: Connections per Second Message-ID: <8908292329.AA11907@icsib.Berkeley.EDU> The recent discussion about CPS numbers made me wonder why we need such new metrics at all. Why not just record the execution time in seconds? Here is what I don't like about CPS numbers: * They don't facilitate the performance comparison of connectionist models and non-connectionist models (e.g. statistical pattern classifiers). Are we afraid of such comparisons? * If CPS numbers become the accepted yardstick with which to measure execution models, then one is likely to penalize new approaches which are slower per connection but, overall, require fewer connections. * CPS numbers are very hard to interpret, as we have seen in the recent discussion about CPS. Every performance metric is open to abuse, but measuring performance in absolute execution time seems to me the least ambiguous metric. After all, the user is not interested in how many connections are updated per second but how long it will take to solve a (benchmark) problem. A similar development has taken place in the field of Logic Programming, where everything is measured in KLIPS (kilo logical inferences per second). This has led to such confusion that it is now virtually impossible to fairly compare competing execution models (not to mention comparison with other programming paradigms). Let's hope this will not happen to connectionism. -Joachim Beer From Scott.Fahlman at B.GP.CS.CMU.EDU Tue Aug 29 20:43:21 1989 From: Scott.Fahlman at B.GP.CS.CMU.EDU (Scott.Fahlman@B.GP.CS.CMU.EDU) Date: Tue, 29 Aug 89 20:43:21 EDT Subject: Connections per Second In-Reply-To: Your message of Tue, 29 Aug 89 16:29:09 -0700. <8908292329.AA11907@icsib.Berkeley.EDU> Message-ID: Why not just record the execution time in seconds? Because implementations that differ by several orders of magnitude in raw speed cannot be compared using the identical network. For some simulators it would take forever to measure the runtime of any net with more than 100 connections; for other implementations, that's not even enough to fill up the pipeline. As long as the time required by a given implementation grows more or less linearly with the number of connections in the network, CPS tells you something useful about that implementation: how long it will take to run some number of passes on a given network. And if two systems implement the same algorithm (e.g. standard backprop), then the CPS numbers give you a rough way of comparing them: one Warp equals three Commodore-64's, or whatever. * They don't facilitate the performance comparison of connectionist models and non-connectionist models (e.g. statistical pattern classifiers). Are we afraid of such comparisons? CPS numbers are clearly not useful for comparing two different neural net algorithms, unless they are very close to one another or unless there is some formal transformation from one algorithm into another. A Boltzmann CPS is very different from a backprop CPS or a Hopfield CPS. So of course there's no way to compare backprop to some statistical classifier using CPS numbers -- they are only good for comparing members of the same family. You're right, when it comes to comparing the speed of a given connectionist model against a statistical model (or two very different connectionist models), about the only way we can do it is to compare total runtime, on the same machine, programmed by equally competent hackers, and with the same degree of optimization for speed vs. flexibility. If any of these conditions doesn't hold, you can be off by a large factor, but still you might be able to use the result for a crude order-of-magnitude comparison of runtimes.
* If CPS numbers become the accepted yardstick with which to measure execution models, then one is likely to penalize new approaches which are slower per connection but, overall, require fewer connections. Usually the tradeoff is between the speed of a single training cycle and the number of cycles needed, not the size of the net. But in any case, I think there is no danger as long as we all realize that CPS is useless in comparing two significantly different algorithms. * CPS numbers are very hard to interpret, as we have seen in the recent discussion about CPS. That's why we're having this discussion. Maybe we can agree on a common metric that is useful, at least in a very limited set of circumstances. I suspect that the CPS numbers currently being tossed around for backprop are ambiguous by a factor of two, and it would be nice to sort that out. -- Scott From chrisley at arisia.xerox.com Tue Aug 29 22:05:18 1989 From: chrisley at arisia.xerox.com (Ron Chrisley UNTIL 10/3/88) Date: Tue, 29 Aug 89 19:05:18 PDT Subject: Connections per Second Message-ID: <8908300205.AA04929@kanga.parc.xerox.com> Joachim Beer writes: "The recent discussion about CPS numbers made me wonder why we need such new metrics at all. Why not just record the execution time in seconds?... After all, the user is not interested in how many connections are updated per second but how long it will take to solve a (benchmark) problem." The CUPS comparison should be between *implementations* of standard connectionist networks, not between different models themselves. I don't know of anyone comparing ART to backprop using CUPS, but I do see people comparing various network systems (like SAIC, HNC, the Rochester connectionist simulator on machine XYZ, etc.) that implement standard models. This measurement, if the calculation procedure is standardized, is useful. You're right that the user wants to know how long it will take, in seconds, for the network to solve a problem, but they usually want to know how long it will take for the network to solve *their* problem, not a benchmark one. This is where CUPS ratings come in handy. If we *banned* [:-)] CUPS ratings, people would use them themselves anyway: "Let's see, if HNC told me that it takes 42.21 seconds for them to do 150 iterations of XOR learning with network configuration Z, then that means that they can do N CUPS, so that means that they could do 100 iterations of my problem with network configuration X in under 76 seconds." Of course, this says nothing about how long (in iterations) it takes to *solve* a particular problem. But reporting absolute times for benchmarks doesn't either (at least not until we have a good theory of how different problems relate to each other in terms of learning difficulty). I should live so long... ?-) Ron Chrisley From hinton at ai.toronto.edu Wed Aug 30 10:34:38 1989 From: hinton at ai.toronto.edu (Geoffrey Hinton) Date: Wed, 30 Aug 89 10:34:38 EDT Subject: Connections Per Second In-Reply-To: Your message of Tue, 29 Aug 89 09:18:00 -0400. Message-ID: <89Aug30.103535edt.11258@ephemeral.ai.toronto.edu> I agree with previous messages that the problem with reporting "connections per second" is that it could mean many different things: 1. CPS for just a forward pass (2 FLOPS per connection) 2. CPS for forward & backward passes (usually 6 FLOPS per connection) 3. CPS for forward & backward passes on connections from input units (4 FLOPS) 4. CPS for forward & backward passes with online weight updates (8 or 6 FLOPS). The obvious solution is to just report the megaflops, where the only operations that are counted are the floating point operations in the inner loops (i.e. we ignore any computations used to implement the sigmoid etc., but we include the weight updating if we are using the online method, so we use the numbers given above). I expect that unscrupulous salespersons will still try to cheat by counting operations like loads and stores and increments to loop counters in the megaflop rating, but at least the researchers will have a standard figure that can be used to compare machines and simulators. Geoff
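As a rough worked example of the per-connection FLOP counts listed above, here is a minimal Python sketch, added for exposition and not drawn from any of the messages. The assignment of the 4- and 6-FLOP counts to the two weight layers is one reading of that list, bias connections and sigmoid evaluations are ignored, and the function names are invented. Applied to the 128-64-128 example discussed earlier (128 patterns, 50 epochs, 41.54 seconds), it gives a little under 18 megaflops.

# Sketch: megaflop rate implied by the per-connection FLOP counts above,
# for online backprop, ignoring bias connections and sigmoid evaluations.
def training_flops_per_pattern(n_in, n_hid, n_out):
    input_conns = n_in * n_hid    # 4 FLOPS each: forward + backward, no error passed back to the inputs
    upper_conns = n_hid * n_out   # 6 FLOPS each: full forward + backward pass
    flops = 4 * input_conns + 6 * upper_conns
    flops += 2 * (input_conns + upper_conns)   # online weight update: one multiply-add per connection
    return flops

def megaflops(n_in, n_hid, n_out, patterns, epochs, seconds):
    total = training_flops_per_pattern(n_in, n_hid, n_out) * patterns * epochs
    return total / seconds / 1e6

print(megaflops(128, 64, 128, 128, 50, 41.54))   # roughly 17.7 megaflops

Changing the assumptions (batch rather than online updates, or counting the sigmoid) shifts the figure accordingly, which is presumably why stating the convention matters.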
From beer%icsia2.Berkeley.EDU at berkeley.edu Wed Aug 30 13:26:20 1989 From: beer%icsia2.Berkeley.EDU at berkeley.edu (Joachim Beer) Date: Wed, 30 Aug 89 10:26:20 PDT Subject: Connections per Second In-Reply-To: Your message of Tue, 29 Aug 89 20:43:21 -0400. <8908300044.AA12383@icsib.Berkeley.EDU> Message-ID: <8908301726.AA00707@icsia2> I still believe execution time is a more versatile metric than CUPS. Of course, it all depends on what you want to benchmark. You might want to compare different algorithms or benchmark connectionist simulators/machines. It is clear that CUPS don't help you to compare different connectionist models. However, most simulators try to support a wide range of different connectionist models. Knowing their CUPS performance for standard backprop does not help me to assess their potential performance for different connectionist models. I would therefore like to see a connectionist benchmark set that incorporates as many different models as possible. I realize that this is probably asking too much, but, for example, even backprop can be implemented in many alternative ways: on-line, off-line, conjugate gradient methods, etc. For example, in order to evaluate a backprop simulator/machine I would like to measure the simulator on the above alternative implementation models, because they have slightly different computational requirements. Just standard (on-line?) backprop CUPS performance is not very informative for anything that aims to be more than just a standard backprop machine or simulator. To make it short, CUPS applies only to one precisely defined benchmark program (usually standard backprop). In my opinion this is too narrow a definition of a benchmark metric. Can one define an (artificial?) benchmark set that reflects as well as possible the operational and computational requirements of connectionist networks in general? Something the connectionist community could agree on. -Joachim From huyser at mithril.stanford.edu Wed Aug 30 13:43:53 1989 From: huyser at mithril.stanford.edu (Karen Huyser) Date: Wed, 30 Aug 89 10:43:53 PDT Subject: Connections per second Message-ID: <8908301743.AA15029@mithril.Stanford.EDU> Geoffrey Hinton writes: " The obvious solution is just to report the megaflops . . ." I disagree. To me, the obvious solution is to work to standardize an appropriate definition of CPS for each learning algorithm and to state the definition used in the calculation whenever the CPS measure is quoted. Karen Huyser From delta at csl.ncsu.edu Thu Aug 31 10:49:38 1989 From: delta at csl.ncsu.edu (Thomas Hildebrandt) Date: Thu, 31 Aug 89 10:49:38 EDT Subject: Desire Fukushima Network Simulator Message-ID: <8908311449.AA21098@csl> Dear All: I would like to investigate the work of Kunihiko Fukushima more fully by duplicating his experiments as closely as possible.
If anyone knows of the existence of a working simulator for his 'neocognitron' model, I would appreciate receiving information about it. Naturally, source code would be of inestimable value. Thanks. Thomas H. Hildebrandt (delta at csl36h.ncsu.edu)