A Harder Learning Problem

FROSTROMS%CPVB.SAINET.MFENET@NMFECC.ARPA
Thu Sep 1 17:05:02 EDT 1988


This is a (delayed) response to Alexis P. Wieland's posting of Fri Aug 5
on the spiral problem, _A Harder Learning Problem_:

>  One of the tasks that we've been using at MITRE to test and compare our 
>  learning algorithms is to distinguish between two intertwined spirals.
>  This task uses a net with 2 inputs and 1 output.  The inputs correspond
>  to <x,y> points, and the net should output a 1 on one spiral and
>  a 0 on the other.  Each of the spirals contains 3 full revolutions.

>  This task has some nice features: it's very non-linear, it's relatively
>  difficult (our spiffed up learning algorithm requires ~15-20 million
>  presentations = ~150-200 thousand epochs = ~1-2 days of cpu on a (loaded)
>  Sun4/280 to learn, ... we've never succeeded at getting vanilla bp to
>  correctly converge), and because you have 2 in and 1 out you can *PLOT* 
>  the current transfer function of the entire network as it learns.
>  
>  I'd be interested in seeing other people try this or a related problem.

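For anyone who wants to try the task, the posting pins down the spec (2 inputs, 1 output, two intertwined spirals of 3 full revolutions, targets 1 and 0) but not how the points are generated. A minimal sketch of one possible generator, where the point count, the linearly shrinking radius, and the mirror-image construction of the second spiral are all assumptions:

```python
import math

def two_spirals(points_per_spiral=97, turns=3):
    """Generate an intertwined-spirals dataset of <x,y> points.

    Each spiral makes `turns` full revolutions.  One spiral is labelled 1,
    its point-for-point mirror image through the origin is labelled 0.
    The radius/angle parameterization is an assumption (the post gives
    none); here the radius shrinks linearly as the angle grows.
    """
    data = []
    for i in range(points_per_spiral):
        angle = i * turns * 2 * math.pi / (points_per_spiral - 1)
        radius = 1.0 - i / points_per_spiral      # shrink toward the origin
        x, y = radius * math.sin(angle), radius * math.cos(angle)
        data.append(((x, y), 1))                  # first spiral -> target 1
        data.append(((-x, -y), 0))                # mirrored spiral -> target 0
    return data

pts = two_spirals()   # 2 * 97 = 194 training points, half 1s and half 0s
```

With 2 inputs and 1 output, the learned decision surface can be plotted directly over the plane as the net trains, which is the feature Wieland highlights.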

Here at SAIC, Dennis Walker obtained the following results:

"I tried the spiral problem using the standard Back Propagation
model in ANSim (Artificial Neural System Simulation Environment)
and found that neither spiffed-up learning algorithms nor tricky
learning rate adjustments are necessary to find a solution to this
difficult problem.  Our network had two hidden layers -- a 2-20-10-1
structure for a total of 281 weights.  No intra-layer connections
were necessary.  The learning rates for all 3 layers were set to 0.1
with the momentums set to 0.7.  Batching was used for weight updating.
Also, an error tolerance of 0.15 was used: as long as the output was
within 0.15 of the target no error was assigned.  It took ANSim 13,940
cycles (passes through the data) to get the outputs within 0.3 of the
targets.  (In ANSim, the activations range from -0.5 to 0.5 instead of
the usual 0 to 1 range.)  Using the SAIC Delta Floating Point Processor
with ANSim, this took less than 27 minutes to train (~0.114 seconds/pass).
I also tried reducing the network size to 2-16-8-1 and again was able to
train the network successfully, but it took an unbelievable 300K cycles!
This is definitely a tough problem."
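Two details of Walker's setup are easy to check in code. The quoted figure of 281 weights for a 2-20-10-1 net works out exactly if each unit's bias is counted as a weight, and the error-tolerance trick can be written as a one-line deadzone on the output error. Both function names below are hypothetical, and what happens outside the tolerance band (the ordinary target-minus-output difference) is an assumption; the post only says no error is assigned inside it.

```python
def weight_count(layers):
    # each unit in layer n+1 gets one weight per unit in layer n, plus a bias
    return sum((fan_in + 1) * fan_out
               for fan_in, fan_out in zip(layers, layers[1:]))

def tolerant_error(output, target, tol=0.15):
    # deadzone rule: no error assigned while the output is within tol of target
    diff = target - output
    return 0.0 if abs(diff) <= tol else diff

assert weight_count([2, 20, 10, 1]) == 281   # matches the figure in the post
```

The quoted timing is also self-consistent: 13,940 passes at ~0.114 seconds each is about 1,590 seconds, i.e. just under 27 minutes.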


Stephen A. Frostrom
Science Applications International Corporation
10260 Campus Point Drive
San Diego, CA 92121
(619) 546-6404
frostroms at SAIC-CPVB.arpa


