Higher-order recurrent neural networks
Lee Giles
giles at research.nec.com
Tue Nov 26 18:03:57 EST 1991
More references for higher-order recurrent nets and some general comments:
John Kolen mentions:
*****************************************
Higher-order recurrent networks are recurrent networks with higher-order
connections (e.g., i[1]*i[2]*w[1,2] instead of i[1]*w[1]). An example of a
higher-order recurrent network is Pollack's sequential cascaded network,
which appears, I believe, in the latest issue of Machine Learning. This
network can be described by two three-dimensional weight tensors, W and V,
and the following equations:
O[t]   = Sigmoid( (W . S[t]) . I[t] )
S[t+1] = Sigmoid( (V . S[t]) . I[t] )
where I[t] is the input vector, O[t] is the output vector, and S[t] is the
state vector, each at time t ( . denotes inner product).
**********************************************
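To make Kolen's description concrete, here is a rough NumPy sketch of one
time step of such a second-order network. The dimensions, the random
weights, and the function names are my own illustrative assumptions, not
anything from Pollack's paper:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    n, m, p = 4, 3, 2                      # state, input, output sizes (arbitrary)
    rng = np.random.default_rng(0)
    W = rng.standard_normal((p, n, m))     # output weight tensor W[k,i,j]
    V = rng.standard_normal((n, n, m))     # state weight tensor  V[k,i,j]

    def step(S, I):
        # (W . S[t]) . I[t]  ->  sum over i,j of W[k,i,j] * S[i] * I[j]
        O      = sigmoid(np.einsum('kij,i,j->k', W, S, I))
        S_next = sigmoid(np.einsum('kij,i,j->k', V, S, I))
        return O, S_next

    S = np.zeros(n)                        # initial state
    for I in rng.standard_normal((5, m)):  # a short, arbitrary input sequence
        O, S = step(S, I)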
For other references on higher-order recurrent nets, see the following:
(This list is not meant to be exhaustive, but to give some
flavor of the diversity of work in this area.)
Y.C. Lee, et al., 1986, Physica D.
H.H. Chen, et al., 1986, AIP Conference Proceedings on Neural Networks
for Computing.
F. Pineda, 1988, AIP Conference Proceedings for NIPS.
Psaltis, et al., 1988, Neural Networks.
Giles, et al., 1990, NIPS 2; 1991 IJCNN Proceedings; and 1992,
Neural Computation.
Mozer and Bachrach, 1991, Machine Learning.
Hush, et al., 1991, Proceedings of Neural Networks for
Signal Processing.
Watrous and Kuhn, 1992, Neural Computation.
In particular, the work by Giles et al. describes a 2nd-order,
forward-propagation RTRL algorithm for learning grammars from grammatical
strings.* What may be of interest is that, using a heuristic extraction
method, one can extract the "learned" grammar from the recurrent network
both during and after training.
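Since the extraction method is only sketched here, the following is just one
way the heuristic could look: quantize the continuous state vectors into
bins and record, for each input symbol, the transitions between quantized
states, which yields a finite-state transition table. The binning level q,
the symbol-to-vector dictionary, and the reuse of the step function from the
sketch above are all my own illustrative assumptions:

    import numpy as np

    def quantize(S, q=2):
        # map each sigmoid state unit (in [0,1]) to one of q bins,
        # giving a discrete state label
        return tuple(np.minimum((S * q).astype(int), q - 1))

    def extract_transitions(step, S0, strings, symbols, q=2):
        # symbols: dict mapping each input symbol to its input vector
        transitions = {}                   # (state, symbol) -> next state
        for s in strings:
            S = S0
            for sym in s:
                _, S_next = step(S, symbols[sym])
                transitions[(quantize(S, q), sym)] = quantize(S_next, q)
                S = S_next
        return transitions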
It's worth noting that higher-order nets usually include lower
orders as special cases, i.e., 2nd order includes 1st order.
In addition, sigma-pi units are just a subset of higher-order
models and in some cases do not have the representational
power of higher-order models. For example, the term (using Kolen's
notation above)
S[i,t] . I[j,t]
would have the same weight coefficient in the original
sigma-pi notation as the term
S[j,t] . I[i,t].
Higher-order notation would distinguish between these terms
using the tensor weights W[k,i,j] and W[k,j,i].
*(Similar work has been done by Watrous & Kuhn and by Pollack.)
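To make the sigma-pi point above concrete, here is a small check one can
run (assuming, so that the indices are comparable, that state and input
have the same dimension). A sigma-pi unit that assigns one weight per
unordered product corresponds to a tensor symmetric in its last two
indices, so it is forced to give S[i]*I[j] and S[j]*I[i] the same
coefficient; a general higher-order tensor is not:

    import numpy as np

    n = 3
    rng = np.random.default_rng(1)
    W    = rng.standard_normal((n, n, n))    # full higher-order tensor W[k,i,j]
    W_sp = 0.5 * (W + W.transpose(0, 2, 1))  # sigma-pi-style symmetrized weights

    i, j, k = 0, 1, 2
    print(W[k, i, j], W[k, j, i])            # generally different coefficients
    print(W_sp[k, i, j], W_sp[k, j, i])      # forced to be equal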
C. Lee Giles
NEC Research Institute
4 Independence Way
Princeton, NJ 08540
USA
Internet: giles at research.nj.nec.com
UUCP: princeton!nec!giles
PHONE: (609) 951-2642
FAX: (609) 951-2482