Analysis of time sequences without a window
Robert Port
port at iuvax.cs.indiana.edu
Wed Aug 28 13:38:09 EDT 1991
Tenorio mentioned several schemes for collecting information about
patterns in time sequences. As he noted, several
are essentially the same -- Time Windows and Delay Lines --
since a static (or fixed-window) slice of the signal
is stored and advanced through a structure that preserves
all info in the inputs -- that is, it preserves a raw record of
inputs. (I'm not sure what he means by `recursion'.)
As long as the bandwidth of input is very small (that is, when
there is a small set of possible inputs), this technique is fine.
(Though, as noted before, an a priori window size imposes
a limit on the maximum length of pattern that can be learned.)
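A minimal sketch of the time-window / delay-line scheme, for illustration
only. The function name `sliding_windows`, the toy symbol stream, and the
window size of 3 are all hypothetical and are not taken from Tenorio's post.
It shows that the structure holds a raw record of the last few inputs, and
that nothing longer than the window is ever held at once.

```python
from collections import deque


def sliding_windows(signal, window_size):
    """Yield the contents of a fixed-length delay line after each new sample.

    Hypothetical helper: the delay line keeps the last `window_size` raw
    inputs and nothing else.
    """
    line = deque(maxlen=window_size)  # oldest sample drops off automatically
    for sample in signal:
        line.append(sample)
        if len(line) == window_size:  # only report once the line is full
            yield tuple(line)


# Toy input with a small alphabet (low "bandwidth").
windows = list(sliding_windows("abcabd", 3))

# Every stored slice has exactly the window length, so a 4-symbol pattern
# such as "abca" is never available as a single unit.
longest = max(len(w) for w in windows)
```

With a small input alphabet this is cheap, but the cost of each slice grows
with both the window length and the width of each input sample.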
But for a domain like HEARING, where the entire acoustic
spectrum is sampled (at sampling rates that vary with
frequency), the idea of storing EVERYTHING that comes in is
computationally intractable -- at least, it apparently is for
human hearing (and surely hearing in other animals as well).
Despite the intuitive appeal of theories about `echoic memory',
`precategorical acoustic store', etc, the evidence shows
that these `acoustic memories' do NOT contain anything like
raw sound spectra for any length of time.
Instead, these memories should be called `auditory' since
they contain `names' (or categories of some sort) for
learned patterns. (See my paper in Connection Science, 1990)
One source of evidence for this is simply that when an acoustic
temporal pattern is REALLY NOVEL - eg, an artificial pattern
completely unlike speech or other sounds from our environment -
then listeners do NOT have a veridical representation of it
that can be retained for a second or two. See experiments
by C. S. Watson on patterns of 5-10 tones presented
within a half second or so. The patterns are random-freq
pure-tone sequences (patterns that impressionistically
resemble the sound of a Touch-Tone phone when it
auto-redials, or maybe even a turkey gobble). It is incredibly
difficult to detect changes in, say, the frequency of
one of the tones -- at least it's hard as long as the pattern is
`unfamiliar'. And to really learn the pattern (to near-asymptotic
performance level) requires literally thousands of practice trials!
So what could familiar, learned auditory memories be like if
they aren't specified within a raw time window of the
acoustic signal?
I think the answer is Tenorio's other type: Hysteresis.
This refers to the effect where some properties of past
inputs affect system response to the current
input. A concrete example is a cheap dimmer switch
for a light. Frequently, a given angle for the rotating
knob produces one level of brightness when approached
from the left and another brightness approached from the right.
This kind of nonlinear behavior can be exploited in
a dynamical system (eg, the nervous system or a recurrent
connectionist network) to store information about pattern history.
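The dimmer-switch example can be sketched as a simple backlash ("play")
model. This is an assumed toy model, not a description of any real dimmer
or of the networks discussed here; the name `dimmer` and the 10-degree
`slack` are hypothetical. The point is only that the output at a given knob
angle depends on the direction from which that angle was approached, so the
current state carries information about past inputs.

```python
def dimmer(angles, slack=10.0):
    """Return the brightness level after each knob angle in `angles`.

    Assumed backlash model: the level is dragged along by the knob but may
    lag behind it by up to `slack` degrees in either direction.
    """
    level = 0.0
    history = []
    for angle in angles:
        # Clamp the previous level into [angle - slack, angle + slack].
        level = min(max(level, angle - slack), angle + slack)
        history.append(level)
    return history


# Same final knob angle (50), reached from the left and from the right.
from_left = dimmer([0.0, 50.0])[-1]
from_right = dimmer([100.0, 50.0])[-1]
```

Here `from_left` and `from_right` differ even though the final input is
identical, which is the sense of hysteresis used above.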
By an appropriate learning process the parameters of
the dynamic system can be adjusted to generate a distinctive
trajectory (through activation space) for familiar patterns.
See Anderson and Port, 1990, and Anderson, Port and McAuley, 1991,
for some demonstrations of this kind of `dynamic memory'
in networks trained with recurrent backprop.
This kind of representation for familiar sequential patterns
exhibits many standard properties of human hearing. For example,
this kind of representation (unlike a raw time window representation)
is naturally INVARIANT under changes in the RATE of presentation of
the pattern (just as words or tunes are recognized as the same
despite differences in rate of production). It has been shown
that for `Watson patterns', changing the rate of presentation
to listeners during testing by a factor of 2 or so relative to
the rate used during training has no effect whatever on performance.
The same is true of our recurrent networks.
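To make the contrast with a raw time window concrete, here is a small
hand-built sketch. It is NOT the recurrent network from the papers cited
here and involves no learning. It is a hypothetical state machine
(`state_match`) compared with a hypothetical fixed-window template matcher
(`window_match`), on a made-up four-symbol pattern. It only illustrates why
a memory based on the current state can be indifferent to presentation
rate while a fixed window is not.

```python
def stretch(seq, factor):
    """Present each element `factor` times in a row (a slower rate)."""
    return [s for s in seq for _ in range(factor)]


def window_match(template, stream):
    """Fixed time window: look for an exact copy of the raw template."""
    n = len(template)
    return any(list(stream[i:i + n]) == list(template)
               for i in range(len(stream) - n + 1))


def state_match(template, stream):
    """State-based memory: advance on the next expected symbol, hold the
    state while the current symbol repeats, and reset otherwise.
    Assumes the template has no two identical adjacent symbols."""
    pos = 0
    for s in stream:
        if pos < len(template) and s == template[pos]:
            pos += 1                      # next expected symbol arrived
        elif pos > 0 and s == template[pos - 1]:
            continue                      # same symbol held longer: no change
        else:
            pos = 1 if s == template[0] else 0  # unexpected symbol: restart
        if pos == len(template):
            return True
    return False


pattern = [3, 1, 4, 2]          # made-up "familiar" pattern
slow = stretch(pattern, 2)      # same pattern at half the rate
other = [3, 4, 1, 2]            # a different ordering of the same symbols

window_fast, window_slow = window_match(pattern, pattern), window_match(pattern, slow)
state_fast, state_slow = state_match(pattern, pattern), state_match(pattern, slow)
state_other = state_match(pattern, other)
```

The window matcher accepts the pattern only at the trained rate, while the
state-based matcher accepts it at both rates and still rejects the
reordered sequence.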
The use of dynamic-memory representations by the nervous system for
environmental sounds exploits the fact that most of the sounds we hear are
very similar to sounds we have heard many times before. Only a minute
fragment of the possible spectrally distinct patterns over time
occur in our environment, so apparently we classify sounds into
a (very large) alphabet of familiar sequences. Watson patterns
are not in this set, so we cannot store them for a second or two.
But to return to Michael Tepp's original problem. He apparently
has several sampled physiological measures from milk cows
(eg, body temperature, chemical content of the milk, etc)
and hopes to detect the presence of mastitis in the cow as
early as possible. It seems very likely that the onset of
the disease will exhibit differences in rate between instances
of the illness. So, even though keeping a static time-window is
trivial given the rate at which the physiological data
is generated, the distribution of information about the `target
pattern' (whatever it is) across such a window is very likely NOT
to be constant. Thus a hysteresis-based method of pattern
recognition using a dynamic memory might be expected to have more success.
A few refs:
Anderson, Sven, R. Port and Devin McAuley (1991) Dynamic Memory:
a model for auditory pattern recognition. Mspt. But I will make
it available by ftp from neuroprose at Ohio State. We will post a separate
note when it is there.
Port, Robert (1990) Representation and recognition of temporal
patterns. Connection Science, 151-176. This includes some
description of Watson's work and contains more on the argument
against time windows.
Port, Robert and Sven Anderson (1989) Recognition of melody fragments
in continuously performed music. In G. Olson and E. Smith (eds)
Proceedings of the Eleventh Annual Meeting of the Cognitive
Science Society (L. Erlbaum Assoc, Hillsdale, NJ), pp. 820-827.
Port, R and Tim van Gelder (1991) Representing aspects of language.
Proc of Cog Sci Soc 13, Erlbaum Assoc. Generalizes the notion of
dynamic representations as applied to other kinds of patterns.