Shift Invariance

Irving Biederman ib at rana.usc.edu
Fri Mar 15 19:02:03 EST 1996


Shimon Edelman (March 8) writes:
[Omission of some of posting]
>Note that the purpose of my previous posting was to advocate caution,
>certainly not to argue that all claims of invariance are wrong.
>Fortunately, my job in this matter is easy: just one example of a
>manifest lack of invariance suffices to invalidate the strong version
>of invariance-based theory of vision, which seems to be espoused by
>Goldfarb:

>So, here it goes... Whereas invariance does hold in many recognition
>tasks (in particular, in Biederman's experiments, as well as in the
>experiments reported in [1]), it does not in others (as, e.g., in [2],
>where interaction between size invariance and orientation is
>reported). A recent comprehensive survey of (the far from invariant)
>human performance in recognizing rotated objects can be found in
>[3]. Furthermore, not only recognition, but also perceptual learning,
>seems to be non-invariant in some cases; see [4,5].

[Omission of rest of posting]

        It should be so easy.

        Of course, not ALL of vision is shift invariant (I don't believe
that Goldfarb was asserting that it was), as there is clear evidence that
people are, for example, quite sensitive to the location of objects when
they reach out to grasp them.   The issue of shift invariance was
specifically raised, not for ALL of vision, but for the domain of (what
should be called) object recognition, which I termed "primal access"
(Biederman, '87), in which basic-level or (most) subordinate-level
classification is made from a large and uncertain population of objects, as
when channel surfing.

        I think that readers who are not familiar with some of the
literature cited in Edelman's posting might be misled into thinking that
shift invariance in object recognition is a special case.  As I noted in my
previous posting, the evidence is quite strong that object recognition
tasks, at the same time that they show a visual (and not just verbal or
conceptual) benefit from a single presentation in an experiment, also show
shift invariance.  (They also show invariance to size, scale, reflection,
and rotation in depth, as long as the same parts and relations are readily
distinguished.)   Edelman points out that there have been reports of
view-dependency for depth rotation, not shift, in "recognition" tasks.
(Goldfarb specifically exempted rotation.)  But even for depth rotation,
readers should note that large rotation costs are found only for extremely
difficult discrimination tasks in which viewpoint-invariant information is
generally not available, such as distinguishing among a set of highly
similar bent paper clips.  Such tasks are performed only rarely in normal
visual activities.

        Why would invariance not be found with extremely difficult tasks?
When tasks are difficult, subjects will attempt various strategies (e.g.,
look to the left [a dorsal function?] for a small, distinguishing feature)
that might produce a cost of view-change, but this does not mean that the
representation of the feature (or object) itself is not invariant.  All in
all, the absence of an effect of a view-change puts one in a simpler
explanatory position (assuming adequate power) than does the finding of an
effect of a view-change (say, a shift).  The latter kind of result means that one
has to eliminate other task variables as potential bases of the effect,
such as a search for a distinguishing feature, as noted above.   A finding
of an effect of a change in viewpoint in "object recognition" might or
might not mean that the representation of the object is viewpoint
dependent.  The "view-based" camp will have to demonstrate that the
representation of an object (for primal access) really does change when it
is shifted, or shown at a different size, or orientation in depth (assuming
that the same parts are in view).  They haven't done this yet.

        Whether a TASK (NOT A REPRESENTATION) does or does not manifest
shift invariance might well depend on the degree to which it reflects
dorsal (motor interaction) vs. ventral (recognition) cortical
representations.  The manifestation of these invariances nicely dovetails
with the phenomenon of "object constancy" noted by the Gestaltists, in
which the perception of the real object is largely unaffected by its
translation or rotation.  It is of interest that patient D. F., studied by
Milner and Goodale, who presumably has a damaged ventral pathway, shows no
awareness of objects while at the same time being able to reach competently
for them.

        My views on these matters of view invariance (especially of
rotation in depth) are more fully presented in:

1.  Biederman, I., & Gerhardstein, P. C.  (1993).  Recognizing
depth-rotated objects:  Evidence and conditions for 3D viewpoint
invariance.  Journal of Experimental Psychology:  Human Perception and
Performance, 19, 1162-1182.

2.  Biederman, I., & Gerhardstein, P. C.  (1995).  Viewpoint-dependent
mechanisms in visual object recognition:  Reply to Tarr and Bülthoff
(1995).  Journal of Experimental Psychology:  Human Perception and
Performance, 21, 1506-1514.

3.  Biederman, I., & Bar, M.  (1995).  One-Shot Viewpoint Invariance with
Nonsense Objects.  Paper presented at the Annual Meeting of the Psychonomic
Society, 1995, Los Angeles, November.  Available on our WWW site:
http://rana.usc.edu:8376/~ib/iul.html





