Paper Announcement
Arun Jagota
jagota at cs.Buffalo.EDU
Thu Dec 13 19:28:20 EST 1990
*************** DO NOT FORWARD TO OTHER BBOARDS*****************
[Note : Please do not reply with 'r' or 'R' to this message]
The following paper, submitted to a special issue of IJPRAI, is now available.
It describes a substantial extension of work presented at IJCNN-90, San Diego.
Degraded Printed Word Recognition with a Hopfield-style Network
Arun Jagota
(jagota at cs.buffalo.edu)
Department of Computer Science
State University Of New York At Buffalo
ABSTRACT
In this paper, the Hopfield-style network, a variant of the discrete Hopfield
network, is applied to (degraded) machine-printed word recognition. It is
seen that the representation and dynamics properties of this network map
very well to this problem. Words to be recognised are stored as
content-addressable memories. Word images are first processed by a hardware
OCR. The network is then used to postprocess the OCR decisions. It is shown
(on postal word images) that for a small stored dictionary (~500 words), the
network's exact recall performance is quite good; for a large (~10,500 words)
dictionary it deteriorates dramatically, but the network still performs very
well at "filtering the OCR output". The benefit of using the network for "filtering"
is demonstrated by showing that a specific distance-based search rule, on a
dictionary of 10,500 words, gives much better word recognition performance
(71% TOP choice, 84% TOP 2.6) on the network (filtered) output than on the
raw OCR output. It is also shown that, for such "filtering", a special case
of the network with two-valued weights performs almost as well as the general
case, which verifies that the essential processing capabilities of the network
are captured by the graph underlying it, the specific values of the positive
weights being relatively unimportant. This might also have implications for
low-precision implementation. The best time efficiency is found when the dictionary of
~10,500 words is stored and the network is used to "filter" OCR output for
266 images. Training and filtering together take only 2 watch-timed minutes
on a Sun SPARCstation.
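To make the content-addressable-memory idea in the abstract concrete, here is a
minimal sketch of a discrete Hopfield network storing +/-1 patterns by Hebbian
outer-product learning and recalling one from a corrupted cue. The patterns,
weight rule, and update schedule are illustrative assumptions, not the paper's
exact "Hopfield-style" variant or its word encoding.

```python
import numpy as np

def train(patterns):
    """Hebbian outer-product learning over +/-1 patterns (assumed rule)."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0)  # no self-connections
    return W / len(patterns)

def recall(W, state, steps=20):
    """Asynchronous sign updates; with luck, settles on a stored memory."""
    state = state.copy()
    for _ in range(steps):
        for i in np.random.permutation(len(state)):
            h = W[i] @ state
            if h != 0:
                state[i] = 1 if h > 0 else -1
    return state

# Store two patterns and recall the first from a one-bit-corrupted cue.
pats = np.array([[1, 1, 1, 1, -1, -1, -1, -1],
                 [1, -1, 1, -1, 1, -1, 1, -1]])
W = train(pats)
noisy = pats[0].copy()
noisy[0] = -1  # flip one bit
print(recall(W, noisy))  # recovers pats[0]
```

With a large dictionary of stored words, such exact recall degrades (as the
abstract reports), which is what motivates using the network as a filter instead.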
------------------------------------------------------------------------
(Raw) Footnote: This problem seems cumbersome if viewed as one of supervised
function learning (feed-forward NNs, Bayesian) due to the large number
(~10,500) of classes (=> large training sets/times). The conventional
treatment is as a dictionary storage + search problem, but the drawback of
sequential search is the large search time during testing. The Hopfield-style
network can be viewed as a particular form of (unsupervised) distributed
dictionary storage + search-by-recurrent-dynamics. Search is rapid and almost
independent of dictionary size. The catch is that functional performance
deteriorates rapidly with dictionary size. All is not lost, however, because
partial performance (filtering) remains very good.
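The filtering + distance-based search pipeline described above can be sketched
as follows: the network prunes the dictionary to a small candidate set, and a
distance rule then ranks candidates against the OCR string. The Levenshtein
(edit) distance and the toy candidate set here are stand-in assumptions; the
paper's specific distance rule is not reproduced.

```python
def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def rank_candidates(ocr_word, candidates, top=2):
    """Return the `top` dictionary words closest to the OCR output."""
    return sorted(candidates, key=lambda w: edit_distance(ocr_word, w))[:top]

# Suppose the network has filtered ~10,500 words down to a few candidates:
filtered = ["BUFFALO", "BUFFORD", "OXFORD", "BEDFORD"]
print(rank_candidates("BUFFAL0", filtered))  # OCR confused O with 0
```

Because the expensive comparison runs only over the filtered candidates rather
than the whole dictionary, search stays fast even as the dictionary grows.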
[[Comments on above welcome but please mail to (jagota at cs.buffalo.edu) directly,
not to CONNECTIONISTS]]
The paper is available in compressed PostScript form by anonymous ftp:
unix> ftp cheops.cis.ohio-state.edu (or, ftp 128.146.8.62)
Name: anonymous
Password: neuron
ftp> cd pub/neuroprose
ftp> binary
ftp> get jagota.wordrec.ps.Z
ftp> quit
unix> uncompress jagota.wordrec.ps.Z
unix> lpr jagota.wordrec.ps
------------------------------------------------------------------------
Previous PostScript incompatibility problems have, by initial assessment, been
corrected. Nevertheless, the paper is also available by e-mail (LaTeX sources)
or surface mail (in that preferred order).
Arun Jagota Dept Of Computer Science
jagota at cs.buffalo.edu 226 Bell Hall,
State University Of New York At Buffalo,
NY - 14260
*************** DO NOT FORWARD TO OTHER BBOARDS*****************