XOR and BP
Juergen Schmidhuber
schmidhu at informatik.tu-muenchen.de
Fri Mar 19 11:42:34 EST 1993
David Glaser writes:
>> ..................., but isn't back-prop with a learning rate of 1
>> (see Luis B. Almeida's posting of 15.3.93) doing something quite a lot
>> like random walk ?
Probably not really.
I ran a couple of simulations using the 2-2-1 architecture (plus a
`true' unit, i.e. an always-on bias unit), but doing random search
in weight space instead of backprop. On average, I had to generate
1500 random weight initializations before hitting the first XOR
solution (each weight drawn uniformly between -10.0 and +10.0).
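A minimal sketch of this kind of random-search experiment, in Python.
The post does not state the activation function or the criterion for
counting a weight vector as an XOR solution, so the logistic units and
the 0.5 output tolerance below are assumptions, as are all function
names. The average it prints should therefore not be expected to match
the figure of 1500.

```python
import math
import random

# The four XOR patterns: ((input1, input2), target).
XOR_PATTERNS = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def sigmoid(x):
    # Logistic activation (an assumption; the post names no activation).
    return 1.0 / (1.0 + math.exp(-x))


def forward(w, x1, x2):
    # 2-2-1 net with a bias ("true") unit feeding every non-input unit.
    # w holds 9 weights: 3 per hidden unit, 3 for the output unit.
    h1 = sigmoid(w[0] * x1 + w[1] * x2 + w[2])
    h2 = sigmoid(w[3] * x1 + w[4] * x2 + w[5])
    return sigmoid(w[6] * h1 + w[7] * h2 + w[8])


def solves_xor(w, tol=0.5):
    # Assumed criterion: every output lies on the correct side of 0.5.
    return all(abs(forward(w, *x) - t) < tol for x, t in XOR_PATTERNS)


def trials_until_solution(rng, low=-10.0, high=10.0, max_trials=1_000_000):
    # Draw fresh weight vectors uniformly from [low, high] until one works.
    for trial in range(1, max_trials + 1):
        w = [rng.uniform(low, high) for _ in range(9)]
        if solves_xor(w):
            return trial
    raise RuntimeError("no XOR solution found within max_trials")


def average_trials(runs=20, seed=0):
    # Average number of random draws needed over several independent runs.
    rng = random.Random(seed)
    return sum(trials_until_solution(rng) for _ in range(runs)) / runs


if __name__ == "__main__":
    print("average trials until first XOR solution:", average_trials())
```

Tightening tol (say to 0.1) or changing the weight range changes the
hit rate considerably, so the count depends strongly on these choices.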
Different architectures and different initialization conditions
influence the average number of trials, of course. Since there
are only 2^4 = 16 mappings from the set of 4 input patterns to a single
binary output, a hypothetical bias-free architecture that picks each
such mapping with equal probability would hit XOR with probability
1/16 per trial and so require about 16 random search trials on
average. The results above seem to imply that Luis' backprop
procedure had to fight against a `negative' architectural bias.
The success of any learning system depends heavily on having the
right bias. Of course, there are architectures and corresponding
learning algorithms that solve XOR in a single `epoch'.
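The figure of 16 trials for the hypothetical bias-free architecture can
be checked by enumeration. This sketch assumes that each trial picks one
of the 16 mappings uniformly and independently (sampling with
replacement), so the number of trials is geometrically distributed with
mean 1/p. The variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# The four binary input patterns and the XOR truth table over them.
INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
XOR_TABLE = tuple(a ^ b for a, b in INPUTS)

# Every mapping from the 4 patterns to one binary output: 2**4 of them.
all_mappings = list(product((0, 1), repeat=len(INPUTS)))

# Probability that one uniformly chosen mapping is exactly XOR.
p_hit = Fraction(sum(m == XOR_TABLE for m in all_mappings), len(all_mappings))

# Mean of a geometric distribution with success probability p_hit.
expected_trials = 1 / p_hit

print(len(all_mappings), p_hit, expected_trials)  # prints: 16 1/16 16
```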
Juergen Schmidhuber
Institut fuer Informatik
Technische Universitaet Muenchen
Arcisstr. 21, 8000 Muenchen 2, Germany
schmidhu at informatik.tu-muenchen.de