Paper Announcement
Arun Jagota
jagota at cs.Buffalo.EDU
Thu Dec 13 19:28:20 EST 1990
*************** DO NOT FORWARD TO OTHER BBOARDS*****************
[Note : Please do not reply with 'r' or 'R' to this message]
The following paper, submitted to a special issue of IJPRAI, is now available.
It describes a substantial extension of work presented at IJCNN-90, San Diego.
Degraded Printed Word Recognition with a Hopfield-style Network
Arun Jagota
(jagota at cs.buffalo.edu)
Department of Computer Science
State University Of New York At Buffalo
ABSTRACT
In this paper, the Hopfield-style network, a variant of the discrete Hopfield
network, is applied to (degraded) machine-printed word recognition. It is
seen that the representation and dynamics properties of this network map
very well to this problem. Words to be recognised are stored as
content-addressable memories. Word images are first processed by a hardware
OCR. The network is then used to postprocess the OCR decisions. It is shown
(on postal word images) that for a small stored dictionary (~500 words), the
network's exact recall performance is quite good; for a large (~10,500 words)
dictionary it deteriorates dramatically, but the network still performs very
well at "filtering the OCR output". The benefit of using the network for "filtering"
is demonstrated by showing that a specific distance-based search rule, on a
dictionary of 10,500 words, gives much better word recognition performance
(71% TOP choice, 84% TOP 2.6) on the network (filtered) output than on the
raw OCR output. It is also shown that, for such "filtering", a special case
of the network with two-valued weights performs almost as well as the general
case, which verifies that the essential processing capabilities of the network
are captured by the graph underlying it, the specific values of the positive
weights being relatively unimportant. This might also have implications for
low-precision implementation. The best time efficiency is found when the dictionary of
~10,500 words is stored and the network is used to "filter" OCR output for
266 images. Training and filtering together take only 2 watch-timed minutes
on a Sun SPARCstation.
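To make the content-addressable-memory idea in the abstract concrete, here is a
minimal sketch of a discrete Hopfield network storing +/-1 patterns by Hebbian
outer-product learning and recalling one from a corrupted cue. The patterns,
weight rule, and update schedule are illustrative assumptions, not the paper's
exact "Hopfield-style" variant or its word encoding.

```python
import numpy as np

def train(patterns):
    """Hebbian outer-product learning over +/-1 patterns (assumed rule)."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0)  # no self-connections
    return W / len(patterns)

def recall(W, state, steps=20):
    """Asynchronous sign updates; with luck, settles on a stored memory."""
    state = state.copy()
    for _ in range(steps):
        for i in np.random.permutation(len(state)):
            h = W[i] @ state
            if h != 0:
                state[i] = 1 if h > 0 else -1
    return state

# Store two patterns and recall the first from a one-bit-corrupted cue.
pats = np.array([[1, 1, 1, 1, -1, -1, -1, -1],
                 [1, -1, 1, -1, 1, -1, 1, -1]])
W = train(pats)
noisy = pats[0].copy()
noisy[0] = -1  # flip one bit
print(recall(W, noisy))  # recovers pats[0]
```

With a large dictionary of stored words, such exact recall degrades (as the
abstract reports), which is what motivates using the network as a filter instead.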
------------------------------------------------------------------------
(Raw) Footnote: This problem seems cumbersome if viewed as one of supervised
function learning (feed-forward NNs, Bayesian) due to the large number
(~10,500) of classes (=> large training sets/times). The conventional
treatment is as a dictionary storage + search problem, but the drawback of
sequential search is the large search time during testing. The Hopfield-style
network can be viewed as a particular form of (unsupervised) distributed
dictionary storage + search-by-recurrent-dynamics. Search is rapid and almost
independent of dictionary size. The catch is that functional performance
deteriorates rapidly with dictionary size. All is not lost, however, because
partial performance (filtering) remains very good.
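The filtering + distance-based search pipeline described above can be sketched
as follows: the network prunes the dictionary to a small candidate set, and a
distance rule then ranks candidates against the OCR string. The Levenshtein
(edit) distance and the toy candidate set here are stand-in assumptions; the
paper's specific distance rule is not reproduced.

```python
def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def rank_candidates(ocr_word, candidates, top=2):
    """Return the `top` dictionary words closest to the OCR output."""
    return sorted(candidates, key=lambda w: edit_distance(ocr_word, w))[:top]

# Suppose the network has filtered ~10,500 words down to a few candidates:
filtered = ["BUFFALO", "BUFFORD", "OXFORD", "BEDFORD"]
print(rank_candidates("BUFFAL0", filtered))  # OCR confused O with 0
```

Because the expensive comparison runs only over the filtered candidates rather
than the whole dictionary, search stays fast even as the dictionary grows.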
[[Comments on above welcome but please mail to (jagota at cs.buffalo.edu) directly,
not to CONNECTIONISTS]]
The paper is available in compressed PostScript form by anonymous ftp:
unix> ftp cheops.cis.ohio-state.edu (or, ftp 128.146.8.62)
Name: anonymous
Password: neuron
ftp> cd pub/neuroprose
ftp> binary
ftp> get jagota.wordrec.ps.Z
ftp> quit
unix> uncompress jagota.wordrec.ps.Z
unix> lpr jagota.wordrec.ps
------------------------------------------------------------------------
Previous PostScript incompatibility problems have, by initial assessment, been
corrected. Nevertheless, the paper is also available by e-mail (LaTeX sources)
or surface mail (in that preferred order).
Arun Jagota Dept Of Computer Science
jagota at cs.buffalo.edu 226 Bell Hall,
State University Of New York At Buffalo,
NY - 14260
*************** DO NOT FORWARD TO OTHER BBOARDS*****************