Why does the error rise in an SRN?

Thu Apr 2 15:34:52 EST 1992

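If you are really using the SRN "Elman-style", then you
are not calculating the true gradient: the context units are
treated as fixed inputs, so no error is propagated back
through earlier time steps.  Thus it is not at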
all surprising that the error might increase as you follow
a false gradient.  To calculate the true gradient you
need to do full backpropagation through time.  Also, the
behaviour you describe, with the error first going down and
then going up, is quite likely: in the beginning the
gradients are large, and the false and true gradients are more
likely to be pointing in approximately the same direction.
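
To make the difference concrete, here is a rough sketch on a toy
network.  Everything in it is an assumption made for illustration,
not anything from your setup: a single hidden unit with
h_t = tanh(w*h_{t-1} + u*x_t), output y_t = v*h_t, squared error,
and made-up weights and data.  It computes the gradient with respect
to the recurrent weight w in both ways and checks each against a
finite-difference estimate.

```python
import math

def forward(w, u, xs):
    """Scalar SRN (toy assumption): h_t = tanh(w*h_{t-1} + u*x_t)."""
    hs = [0.0]  # hs[0] is the initial context; hs[t] is the state after step t
    for x in xs:
        hs.append(math.tanh(w * hs[-1] + u * x))
    return hs

def loss(w, u, v, xs, ds):
    """Sum of squared errors with output y_t = v*h_t and targets ds."""
    hs = forward(w, u, xs)
    return 0.5 * sum((v * h - d) ** 2 for h, d in zip(hs[1:], ds))

def elman_grad_w(w, u, v, xs, ds):
    """'False' gradient: the context h_{t-1} is treated as a fixed input."""
    hs = forward(w, u, xs)
    g = 0.0
    for t in range(1, len(hs)):
        err = v * hs[t] - ds[t - 1]
        g += err * v * (1.0 - hs[t] ** 2) * hs[t - 1]
    return g

def bptt_grad_w(w, u, v, xs, ds):
    """True gradient: error also flows back through the recurrent weight."""
    hs = forward(w, u, xs)
    g = 0.0
    carry = 0.0  # dL/dh_t contributed by later time steps
    for t in range(len(hs) - 1, 0, -1):
        err = v * hs[t] - ds[t - 1]
        dh = err * v + carry            # total dL/dh_t
        da = dh * (1.0 - hs[t] ** 2)    # dL/d(pre-activation at step t)
        g += da * hs[t - 1]
        carry = da * w                  # passed back to h_{t-1}
    return g

def numeric_grad_w(w, u, v, xs, ds, eps=1e-6):
    """Central finite difference, used only as a reference."""
    return (loss(w + eps, u, v, xs, ds) - loss(w - eps, u, v, xs, ds)) / (2 * eps)

if __name__ == "__main__":
    # Made-up weights, inputs and targets.
    w, u, v = 0.9, 0.5, 1.0
    xs = [1.0, -0.5, 0.3, 0.8]
    ds = [0.2, -0.1, 0.4, 0.0]
    print("Elman-style:", elman_grad_w(w, u, v, xs, ds))
    print("full BPTT:  ", bptt_grad_w(w, u, v, xs, ds))
    print("numerical:  ", numeric_grad_w(w, u, v, xs, ds))
```

In this toy case the BPTT value should agree with the numerical
estimate while the Elman-style value does not, since it drops every
term that passes through more than one time step.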

Tony Plate