preprints and reports
Juergen Schmidhuber
schmidhu@informatik.tu-muenchen.dbp.de
Tue Apr 30 03:17:00 EDT 1991
Recent preprints and technical reports are available via ftp:
------------------------------------------------------------------
ADAPTIVE DECOMPOSITION OF TIME
Juergen Schmidhuber, TUM
(Talk at ICANN'91, Helsinki, June 24-28, 1991)
In this paper we introduce design principles for unsupervised
detection of regularities (like causal relationships) in temporal
sequences. One basic idea is to train an adaptive predictor module to
predict future events from past events, and to train an additional
confidence module to model the reliability of the predictor's
predictions. We select system states at those points in time where
there are changes in prediction reliability, and use them
recursively as inputs for higher-level predictors. This can be
beneficial for `adaptive sub-goal generation' as well as for
`conventional' goal-directed (supervised and reinforcement) learning:
Systems based on these design principles were successfully tested
on tasks where conventional training algorithms for recurrent nets fail.
Finally we describe the principles of the first neural sequence `chunker'
which collapses a self-organizing multi-level predictor hierarchy into a
single recurrent network.
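To make the selection idea concrete, here is a minimal Python sketch (not the paper's architecture; the tabular predictor, learning rate, and threshold are illustrative assumptions): a predictor module is trained on-line, a confidence module learns the expected prediction error per state, and time steps where the confidence changes sharply are selected as inputs for a hypothetical higher-level predictor.

import numpy as np

def decompose(seq, lr=0.2, threshold=0.3):
    n = seq.max() + 1
    pred = np.full((n, n), 1.0 / n)  # predictor module: P(next | current)
    conf = np.ones(n)                # confidence module: expected error per state
    events = []                      # selected states for the next hierarchy level
    for t in range(1, len(seq) - 1):
        cur, nxt = seq[t], seq[t + 1]
        # select states where prediction reliability changes
        if abs(conf[cur] - conf[seq[t - 1]]) > threshold:
            events.append(t)
        target = np.zeros(n)
        target[nxt] = 1.0
        err = 0.5 * np.abs(target - pred[cur]).sum()  # error in [0, 1]
        pred[cur] += lr * (target - pred[cur])        # train the predictor
        conf[cur] += lr * (err - conf[cur])           # train the confidence module
    return events

print(decompose(np.array([0, 1, 2, 0, 1, 2, 0, 1, 3, 0, 1, 2])))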
LEARNING TO GENERATE SUBGOALS FOR ACTION SEQUENCES
Juergen Schmidhuber, TUM
(Talk at ICANN'91)
This paper extends the technical report FKI-129-90 (`Toward compositional
learning with neural networks').
USING ADAPTIVE SEQUENTIAL NEUROCONTROL FOR EFFICIENT LEARNING
OF TRANSLATION AND ROTATION INVARIANCE
Juergen Schmidhuber and Rudolf Huber, TUM
(Talk at ICANN'91)
This paper is based on FKI-128-90 (announced earlier).
-------------------------------------------------------------------
LEARNING TO CONTROL FAST-WEIGHT MEMORIES: AN ALTERNATIVE TO
DYNAMIC RECURRENT NETWORKS
Juergen Schmidhuber, TUM
Technical report FKI-147-91, March 26, 1991
Previous algorithms for supervised sequence learning are based on
dynamic recurrent networks. This paper describes alternative
gradient-based systems consisting of two feed-forward nets
which learn to deal with temporal sequences by using fast weights:
The first net learns to produce context-dependent weight changes
for the second net, whose weights may vary very quickly. One
advantage of the method over more conventional recurrent net
algorithms is that it does not necessarily occupy
full-fledged units (experiencing some sort of feedback) to store
information over time: a single weight may suffice to store
temporal information. Since most networks have many more
weights than units, this property represents a potential for
storage efficiency. Various learning methods are derived. Two
experiments with unknown time delays illustrate the approach. One
experiment shows how the system can be used for adaptive temporary
variable binding.
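A minimal sketch of the fast-weight idea, assuming a linear `slow' net and additive fast-weight updates (shapes, nonlinearities, and the update rule are illustrative assumptions, not the report's derivation): the first net maps the current input to a weight change for the second net, so temporal information is stored in weights rather than in units.

import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 4, 3
# slow weights: would be trained by gradient descent on the task error
W_slow = rng.normal(0.0, 0.1, size=(n_out * n_in, n_in))
# fast weights: rewritten at every time step, they carry the context
W_fast = np.zeros((n_out, n_in))

def step(x):
    global W_fast
    dW = np.tanh(W_slow @ x).reshape(n_out, n_in)  # context-dependent change
    W_fast = W_fast + dW                           # a weight stores temporal info
    return np.tanh(W_fast @ x)                     # second net's output

for x in rng.normal(size=(5, n_in)):
    print(step(x))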
NEURAL SEQUENCE CHUNKERS
Juergen Schmidhuber, TUM
Technical report FKI-148-91, April 26, 1991
This paper addresses the problem of meaningful hierarchical adaptive
decomposition of temporal sequences. This problem is relevant for
time-series analysis as well as for goal-directed learning. The first
neural systems for recursively chunking sequences are described.
These systems are based on the `principle of history
compression', which essentially says: As long as a predictor
is able to predict future environmental inputs from previous ones, no
additional knowledge can be obtained by observing these inputs in
reality. Only unpredicted inputs deserve attention. The focus is on
a 2-network system which tries to collapse a self-organizing multi-level
predictor hierarchy into a single recurrent network (the automatizer).
The basic idea is to feed everything that was not expected by the
automatizer into a `higher-level' recurrent net (the chunker). Since
the expected things can be derived from the unexpected things by the
automatizer, the chunker is fed with a reduced description of the input
history. The chunker has a comparatively easy job in finding
possibilities for additional reductions, since it works on a slower
time scale and receives less inputs than the automatizer. Useful
internal representations of the chunker in turn are taught to the
automatizer. This leads to even more reduced input descriptions for
the chunker, and so on. Experimentally it is shown that the system can
be superior to conventional training algorithms for recurrent nets:
It may require fewer computations per time step, and in addition it may
require fewer training sequences. A possible extension for reinforcement
learning and adaptive control is mentioned. An analogy is drawn between
the behavior of the chunking system and the apparent behavior of humans.
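The history-compression principle itself can be illustrated with a toy Python sketch (the real automatizer and chunker are recurrent nets; the table-based predictor below is an assumption for brevity): only inputs the automatizer fails to predict are passed upward, so the chunker receives a reduced description of the input history.

def compress(seq):
    pred = {}      # toy automatizer: remembered successor of each symbol
    reduced = []   # the chunker's input: only the unexpected events
    for t in range(1, len(seq)):
        prev, cur = seq[t - 1], seq[t]
        if pred.get(prev) != cur:    # unpredicted input: deserves attention
            reduced.append((t, cur))
        pred[prev] = cur             # automatizer updates its prediction on-line
    return reduced

# the repeating pattern is compressed away; only surprises remain
print(compress(list("abcabcabdabc")))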
ADAPTIVE CONFIDENCE AND ADAPTIVE CURIOSITY
Juergen Schmidhuber
Technical Report FKI-149-91, April 26, 1991
Much of the recent research on adaptive neuro-control and reinforcement
learning focuses on systems with adaptive `world models'. Previous
approaches, however, do not address the problem of modelling the
reliability of the world model's predictions in uncertain environments.
Furthermore, previous approaches usually rely on some ad hoc method
(like random search) to train the world model to predict
future environmental inputs from the system's previous inputs and
control outputs. This paper introduces ways of modelling the reliability of
the outputs of adaptive world models, and it describes more sophisticated
and sometimes much more efficient methods for their adaptive construction
by on-line state space exploration: For instance, a 4-network
reinforcement learning system is described which tries to maximize the
future expectation of the temporal derivative of the adaptively assumed
reliability of future predictions. The system is `curious' in the sense
that it actively tries to provoke situations for which it `learned
to expect to learn' something about the environment. In a very limited
sense the system learns how to learn. An experiment with a simple
non-deterministic environment demonstrates that the method can be
clearly faster than the conventional model-building strategy.
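A schematic Python sketch of the curiosity reward under strong simplifying assumptions (a tabular world model instead of the 4-network system; names and the learning rate are invented for illustration): the intrinsic reward is the change in the model's reliability, so the controller is driven toward (state, action) pairs where it expects to learn something.

import numpy as np

rng = np.random.default_rng(1)
n_states, n_actions = 3, 2
# tabular world model: estimated P(next state | state, action)
model = np.full((n_states, n_actions, n_states), 1.0 / n_states)
# learned estimate of the learning progress for each (state, action)
progress = np.zeros((n_states, n_actions))

def curiosity_step(s, a, s_next, lr=0.3):
    before = model[s, a, s_next]
    model[s, a] *= 1.0 - lr               # train the world model on (s, a, s')
    model[s, a, s_next] += lr
    gain = model[s, a, s_next] - before   # temporal derivative of reliability
    progress[s, a] += lr * (gain - progress[s, a])
    return gain                           # the intrinsic `curiosity' reward

s = 0
for _ in range(20):
    a = int(np.argmax(progress[s]))       # prefer expected learning progress
    s_next = int(rng.integers(n_states))  # toy non-deterministic environment
    curiosity_step(s, a, s_next)
    s = s_next
print(progress)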
------------------------------------------------------------------------
To obtain copies of the papers, do:
unix> ftp 131.159.8.35
Name: anonymous
Password: your name, please
ftp> binary
ftp> cd pub/fki
ftp> get <file>.ps.Z
ftp> bye
unix> uncompress <file>.ps.Z
unix> lpr <file>.ps
Here <file> stands for any of the following six possibilities:
icanndec (Adaptive Decomposition of Time)
icannsub (Subgoal-Generator). This paper contains 5 partly
hand-drawn figures which are not retrievable. Sorry.
icanninv (Sequential Neuro-Control).
fki147 (Fast Weights)
fki148 (Sequence Chunkers)
fki149 (Adaptive Curiosity)
Please do not forget to leave your name. This will allow us to save
paper if you are on our hardcopy mailing list.
NOTE: icanninv.ps, fki148.ps, and fki149.ps are designed
for European A4 paper format (21.0cm x 29.7cm).
------------------------------------------------------------------------
In case of ftp-problems contact
Juergen Schmidhuber
Institut fuer Informatik,
Technische Universitaet Muenchen
Arcisstr. 21
8000 Muenchen 2
GERMANY
or send email to
schmidhu@informatik.tu-muenchen.de
DO NOT USE REPLY!