paper available: fast dictionary search
Lucas, Simon M
sml at essex.ac.uk
Wed Oct 18 09:07:57 EDT 2000
Dear All,
The following paper on graph-based dictionary
search (due to appear in Pattern Recognition
Letters) is available at:
http://algoval.essex.ac.uk/papers/dictionary/prl.ps
As far as I am aware, the method described is
unique in that the retrieval speed is independent
of the size (number of entries) in the dictionary.
The core of the method is based on a lazy matrix
parser for probabilistic context-free grammars,
which are a type of probabilistic
graphical model. Therefore, it may be interesting
to explore other possible uses of the method in that
area.
As always, comments are welcome.
Best regards,
Simon Lucas
----------------------------------------------------
Title: Efficient graph-based dictionary search
and its application to text-image searching
Keywords: dictionary search, text image indexing,
graph search
Abstract
---------
This paper describes a novel method for applying
dictionary knowledge to optimally interpret the
confidence-rated hypothesis sets produced
by lower-level pattern classifiers.
This problem arises whenever image or video
databases need to be scanned for textual content,
and where some of the text strings are expected
to be strings from a dictionary. The method
is especially appropriate for large dictionaries,
as might occur in vehicle registration number
recognition for example.
The problem is cast as enumerating the paths in
a graph in best-first order given the constraint
that each complete path is a word in
some specified dictionary. The solution
described here is of particular interest
due to its generality, flexibility and
because the time to retrieve each path is
independent of the size of the dictionary.
Synthetic results are presented for searching
dictionaries of up to 1 million UK postcodes
given graphs that correspond to insertion,
deletion and substitution errors.
We also present initial results from processing
real noisy text images.
--------------------------------------------------
Dr. Simon Lucas
Senior Lecturer and MSc E-commerce Director
Department of Computer Science
University of Essex
Colchester CO4 3SQ
United Kingdom
Email: sml at essex.ac.uk
http://cswww.essex.ac.uk
--------------------------------------------------
More information about the Connectionists
mailing list