No subject

Scott_Fahlman@SEF-PMAX.SLISP.CS.CMU.EDU
Fri Jan 31 09:55:54 EST 1992


    	I am working on the retention of knowledge by a neural network.
    A neural network tends to forget the past training when it is trained on
    new data points...

Geoff Hinton and his students did some early work on damaging nets and then
re-training, and also on the use of fast-changing and slow-changing
weights.  Perhaps he can supply some references to this work and related
work done by others.

Cascade-Correlation has some interesting properties related to retention.
If you train and then switch to a new training set, you mess up the
output-layer weights, but you retain all of the old feature detectors
(hidden units), and maybe build some new ones for the new data.  Then if
you return to the old training set or a composite set, re-learning is
generally quite fast.
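
To make that concrete, here is a rough toy sketch (my own illustration,
not the actual Cascor code) of the idea: hidden feature detectors are
frozen once they are installed, so switching to a new training set only
means refitting the output weights and perhaps adding a few new units.
All of the names below (CascadeSketch, add_hidden_unit, refit_outputs)
are mine.

import numpy as np

class CascadeSketch:
    """Single-output cascade: frozen hidden units + refittable output weights."""

    def __init__(self, n_inputs):
        self.n_inputs = n_inputs
        self.hidden = []    # weight vectors of installed (frozen) hidden units
        self.w_out = None   # output weights, refit for each training set

    def _features(self, X):
        # Raw inputs, a bias column, and the outputs of all frozen hidden
        # units; each unit sees the inputs plus every earlier unit (cascade).
        feats = np.hstack([X, np.ones((len(X), 1))])
        for w in self.hidden:
            h = np.tanh(feats @ w)
            feats = np.hstack([feats, h[:, None]])
        return feats

    def add_hidden_unit(self, X, y, n_candidates=50, seed=0):
        # Crude stand-in for candidate training: keep the random candidate
        # whose output correlates best with the current residual error.
        rng = np.random.default_rng(seed)
        feats = self._features(X)
        resid = y - self.predict(X)
        best_w, best_score = None, -np.inf
        for _ in range(n_candidates):
            w = rng.standard_normal(feats.shape[1])
            score = abs(np.corrcoef(np.tanh(feats @ w), resid)[0, 1])
            if score > best_score:
                best_w, best_score = w, score
        self.hidden.append(best_w)   # frozen from now on

    def refit_outputs(self, X, y):
        # The only weights that change when the training set changes.
        feats = self._features(X)
        self.w_out, *_ = np.linalg.lstsq(feats, y, rcond=None)

    def predict(self, X):
        feats = self._features(X)
        if self.w_out is None:
            return np.zeros(len(X))
        # Units added since the last refit are ignored until outputs are refit.
        return feats[:, :len(self.w_out)] @ self.w_out

The usage pattern would be: train on task A (a few rounds of
add_hidden_unit and refit_outputs), switch to task B with refit_outputs
plus maybe a new unit or two, then return to task A.  The old feature
detectors are all still there, so a single refit_outputs call recovers
the old performance quickly.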

This is demonstrated in my Recurrent Cascade-Correlation (RCC) paper that
can be found in NIPS-3 or in neuroprose.  I train a net to recognize Morse
code by breaking the training into distinct (not cumulative) lessons,
starting with the shortest codes, and then training on all codes at once.
This works better than training on all codes right from the start.
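
For the curious, here is a hedged illustration (my own, not the code
from the RCC experiments) of what such a lesson schedule looks like:
distinct, non-cumulative lessons grouped by code length, shortest codes
first, followed by one final lesson that presents all of the codes at
once.

MORSE = {"E": ".",   "T": "-",
         "A": ".-",  "I": "..",   "N": "-.",   "M": "--",
         "S": "...", "U": "..-",  "D": "-..",  "R": ".-.",
         "H": "....", "B": "-...", "F": "..-.", "L": ".-.."}

def lesson_schedule(codes):
    by_length = {}
    for letter, code in codes.items():
        by_length.setdefault(len(code), []).append((letter, code))
    # Each lesson holds only the codes of one length (not cumulative)...
    lessons = [by_length[k] for k in sorted(by_length)]
    # ...and the last lesson presents all codes at once.
    lessons.append(sorted(codes.items()))
    return lessons

for i, lesson in enumerate(lesson_schedule(MORSE), start=1):
    letters = ", ".join(letter for letter, _ in lesson)
    print(f"lesson {i}: {letters}")
    # Here an RCC-style net would be trained on this lesson only, keeping
    # the hidden units it has accumulated from the earlier lessons.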

This opens up the possibility of a sort of long-lived, eclectic net that is
trained in many different domains over its "lifetime" and that gradually
accumulates a rich library of useful feature detectors.  The current
version of Cascor wouldn't be very good for this, since later hidden units
would be too deep and would have too many inputs, but I think this problem
of excess connectivity may be easy to solve.

-- Scott


Scott E. Fahlman
School of Computer Science
Carnegie Mellon University

