Vision (What's wrong with Marr's model)
Steve Lehar
slehar at park.bu.edu
Tue Jan 8 08:28:18 EST 1991
Frank Smieja <gmdzi!smieja at relay.EU.net> asks about recent objections
to Marr's theory of vision. Here is my opinion.
David Marr's book is delightfully lucid and beautifully illustrated,
and I thoroughly agree with his analysis of the three levels of
modeling. Nevertheless I believe that there are two fatal flaws in
the philosophy of his vision model.
The first fatal flaw is the feedforward nature of this model, from the
raw primal sketch through the 2 1/2-D sketch to the 3-D model
representation. Decades of "image understanding" and "pattern
recognition" research have shown us that such feed-forward processing
has a great deal of difficulty with natural imagery. The problem is
that whenever "feature extraction" or "image enhancement" is
performed, it recognizes or enhances some features but in the process
inevitably degrades others or introduces artifacts. With
successive levels of processing the artifacts accumulate and combine
until at the highest levels of processing there is no way to
distinguish the real features from the artifacts. Even in our own
vision, with all its sophistication, we occasionally see things that
are not there. The real problem with such feedforward models is that
once a stage of processing is performed, it is never reviewed or
reconsidered.
Grossberg suggests how nature solves this problem, by use of top-down
feedback. Whenever a feature is recognized at any level, a copy of
that feature is passed back DOWN the processing hierarchy in an
attempt to improve the match at the lower levels. If, for instance, a
set of disconnected edges suggests a larger continuous edge to a higher
level, that "hypothesis" is passed down to the local edge detectors to
see if they can find supporting evidence for the missing pieces by
locally lowering their detection thresholds. If a faint edge is
indeed found where expected, it is enhanced by resonant feedback. If,
however, there is strong local opposition to the hypothesis, then the
enhancement is NOT performed. This is the cooperative / competitive
loop of the BCS model which serves to disambiguate the image by
simultaneous matching at multiple levels. This explains how, when we
occasionally see something that isn't there, we see it in such detail
until at a higher level a conflict occurs, at which time the
apparition "pops" back to being something more consistent with the
global picture.
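
The feedback loop described above can be illustrated with a toy
sketch. This is NOT the BCS model itself, only a one-dimensional
caricature of the idea under stated assumptions: edge evidence is a
list of strengths along a line, the "higher level" is a simple vote on
whether enough pieces were found, and local opposition is a made-up
number per position. All function names, parameters, and threshold
values here are hypothetical and chosen only for illustration.

```python
def detect(strengths, threshold):
    """Bottom-up pass: mark positions whose edge strength reaches the threshold."""
    return [s >= threshold for s in strengths]


def feedback_pass(strengths, opposition, high=0.5, low=0.2,
                  veto=0.5, min_support=0.5):
    """Toy top-down pass (hypothetical, not Grossberg's equations).

    1. Detect edges bottom-up with the normal (high) threshold.
    2. If enough pieces were found, hypothesize one continuous edge.
    3. Revisit the gaps with a lowered threshold, but leave a gap
       unfilled where local opposition to the hypothesis is strong.
    """
    detected = detect(strengths, high)

    # Higher level: no hypothesis unless enough of the line is supported.
    if sum(detected) / len(detected) < min_support:
        return detected

    result = []
    for found, s, opp in zip(detected, strengths, opposition):
        if found:
            result.append(True)
        else:
            # Faint evidence is accepted only if it is not locally vetoed.
            result.append(s >= low and opp < veto)
    return result


# Six positions along a candidate edge; positions 2 and 4 are faint.
strengths = [0.9, 0.8, 0.3, 0.7, 0.3, 0.9]
# Position 4 has strong evidence against the hypothesis.
opposition = [0.0, 0.0, 0.1, 0.0, 0.8, 0.0]

print(detect(strengths, 0.5))                # feed-forward result only
print(feedback_pass(strengths, opposition))  # after top-down feedback
```

In this sketch the purely feed-forward pass misses both faint
positions. The feedback pass recovers position 2, where the faint
evidence agrees with the hypothesis, and leaves position 4 alone,
where local opposition vetoes the enhancement.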
The second fatal flaw in Marr's vision model is related to the first.
In the finest tradition of "AI", Marr's 3-D model is an abstract
symbolic representation of the visual input, totally divorced from the
lower level stimuli which generated it. The great advance of the
connectionist perspective is that manipulation of high level symbols
is meaningless without regard to the hierarchy of lower level
representations to which they are attached. When you look at your
grandmother for instance, some high level node (or nodes) must fire in
recognition. At the same time however you are very conscious of the
low level details of the image, the strands of hair, the wrinkles
around the eyes etc. In fact, even in her absence the high level node
conjures up such low level features, without which that node would have
no real meaning. It is only because that node rests on the pinnacle of
a hierarchy of such lower level nodes that it has a meaning of
"grandmother". The perfectly grammatical sentence "Grandmother is
purple" is only recognized as nonsense when visualized at the lowest
level, illustrating that logical processing cannot be separated from
low level visualization.
Although I recognize Marr's valuable and historic contribution to the
understanding of vision, I believe that in this fast moving field we
have already progressed to new insights and radically different
models. I would be delighted to provide further information by email
to interested parties on Grossberg's BCS model, and my own
implementation of it for image processing applications.
(O)((O))(((O)))((((O))))(((((O)))))(((((O)))))((((O))))(((O)))((O))(O)
(O)((O))((( slehar at park.bu.edu )))((O))(O)
(O)((O))((( Steve Lehar Boston University Boston MA )))((O))(O)
(O)((O))((( (617) 424-7035 (H) (617) 353-6741 (W) )))((O))(O)
(O)((O))(((O)))((((O))))(((((O)))))(((((O)))))((((O))))(((O)))((O))(O)