"Universal Approximators"
John Merrill
merrill at bucasb.BU.EDU
Tue Nov 14 12:30:26 EST 1989
YlC> Yann le Cun
JM> John Merrill (me)
JM> In fact, the radial basis function theorem gives somewhat better
JM> bounds on the *number* of intermediate nodes necessary, and, as a
JM> consequence, indicates that if you're only interested in
JM> approximation, you want to use RBF's.
YlC> But you can build RBF's (or rather, multidimensional "bumps")
YlC> with two layers of sigmoid units. And you just need O(n) units
YlC> for n bumps. A third (linear) layer can sum up the bumps.
YlC> So i guess the bound on the number of intermediate nodes is the
YlC> same in both cases.
Not quite. If your input lies in ${\bf R}^k$, it takes at least k units
(in the lower hidden level) to build a single k-dimensional bump in
the upper hidden level.(*) As a consequence, the network that this
argument gives is wildly more computationally demanding than the
original RBF network, since it's got to have $nk^2$ edges between the
input layer and the first hidden layer, as well as $nk$ more edges
between the two hidden layers, for a total edge count of $n (k^2 +
k)$, as compared to an edge count of $nk$ for the RBF network. If $k
= 1000$, this is a factor of $1000$ more costly (in terms of edges;
it's actually less than that, since each RBF node needs an extra few
multiplies.)
YlC> You can prove the universality of 2 hidden layer structures that
YlC> way (and the proof is a lot simpler than with 1 hidden layer).
Absolutely, and it's the one I would actually teach if I taught one at
all.
--- John
(*) It might seem that one would need 2k such units; in fact, by using
k-dimensional simplices as the bump-shapes, k units suffice.
More information about the Connectionists
mailing list