Thesis available: Constrained weight nets

es2029@eng.warwick.ac.uk
Fri Sep 27 09:00:32 EDT 1996


The following PhD Thesis is available on the web:

----------------------------------------------------
Feedforward Neural Networks with Constrained Weights
----------------------------------------------------

Altaf H. Khan
(Email address effective 7 Oct 96: a.h.khan@ieee.org)

Department of Engineering, University of Warwick,
Coventry, CV4 7AL, England 

August 1996

218 pages - gzipped postscript version available as

  http://www.eng.warwick.ac.uk/~es2029/thesis.ps.gz

This thesis will also be made available on Neuroprose
in the near future.

----------------------------------------------------

Thesis Summary

The conventional multilayer feedforward network with
continuous weights is expensive to implement in digital
hardware. Two new types of networks are proposed which lend
themselves to cost-effective implementations in hardware and
have a fast forward-pass capability. These two differ from the
conventional model in having extra constraints on their
weights: the first allows its weights to take integer values in
the range [-3, 3] only, whereas the second restricts its
synapses to the set {-1,0,1} while allowing unrestricted
offsets. The benefits of the first configuration are in having
weights which are only 3 bits deep and a multiplication
operation requiring at most one shift, one add, and one
sign-change instruction. The advantages of the second are in
having 1-bit synapses and a multiplication operation which
consists of a single sign-change instruction. 
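
As a rough illustration of the arithmetic described above (a
sketch of my own, not code from the thesis), each neuron's
weighted sum might be computed without general multiplications
as follows:

  def integer_weight_sum(inputs, weights):
      # Weights restricted to the integers in [-3, 3]: each
      # product costs at most one shift, one add, and one sign
      # change (for fixed-point x, x + x is the shift x << 1).
      total = 0.0
      for x, w in zip(inputs, weights):
          m = abs(w)
          if m == 0:
              p = 0.0
          elif m == 1:
              p = x
          elif m == 2:
              p = x + x                 # one shift
          else:                         # m == 3
              p = (x + x) + x           # one shift and one add
          total += -p if w < 0 else p   # sign change if w < 0
      return total

  def multiplier_free_sum(inputs, synapses, offset):
      # Synapses restricted to {-1, 0, 1} with an unrestricted
      # real-valued offset: each 'multiplication' is at most a
      # single sign change.
      total = offset
      for x, s in zip(inputs, synapses):
          if s == 1:
              total += x
          elif s == -1:
              total -= x
      return total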

The procedure proposed for training these networks starts like
conventional error backpropagation, but becomes increasingly
discretised in its behaviour as the network approaches an error
minimum. Mainly based on steepest descent,
it also has a perturbation mechanism to avoid getting trapped
in local minima, and a novel mechanism for rounding off `near
integers'. It incorporates weight elimination implicitly, which
simplifies the choice of the start-up network configuration for
training. 
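
The thesis's own algorithm is not reproduced here, but the
behaviour just described might be sketched along these lines
(all hyperparameter names and values are illustrative
assumptions; in practice the rounding tolerance would be
varied as training approaches a minimum):

  import random

  def train_step(weights, grads, lr=0.1, snap_tol=0.05,
                 perturb_prob=0.01, perturb_scale=0.5):
      # One update in the spirit of the procedure above:
      # steepest descent, rounding off 'near integers', and a
      # random perturbation to avoid getting trapped in local
      # minima. Weights that snap to 0 drop out of the weighted
      # sum, which is how weight elimination arises implicitly.
      updated = []
      for w, g in zip(weights, grads):
          w = w - lr * g                      # steepest-descent move
          nearest = round(w)
          if abs(w - nearest) < snap_tol:     # snap 'near integers'
              w = float(nearest)
          if random.random() < perturb_prob:  # perturbation mechanism
              w += random.uniform(-perturb_scale, perturb_scale)
          updated.append(max(-3.0, min(3.0, w)))  # keep within [-3, 3]
      return updated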

It is shown that the integer-weight network, although lacking
the universal approximation capability, can implement learning
tasks, especially classification tasks, to acceptable
accuracies. A new theoretical result is presented which shows
that the multiplier-free network is a universal approximator
over the space of continuous functions of one variable. In
light of experimental results it is conjectured that the same
is true for functions of many variables.

Decision and error surfaces are used to explore the
discrete-weight approximation of continuous-weight networks
using discretisation schemes other than integer weights. The
results suggest that, provided a suitable discretisation
interval is chosen, a discrete-weight network can be found
which performs as well as a continuous-weight network, but
that it may require more hidden neurons than its conventional
counterpart. 
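
As a concrete example of such a scheme (standard uniform
quantisation, assumed here for illustration rather than taken
from the thesis), each continuous weight can be mapped to the
nearest multiple of a chosen discretisation interval:

  def discretise(weights, interval):
      # Map each weight to the nearest multiple of `interval`;
      # interval = 1.0 reproduces integer weights, and smaller
      # intervals give a finer grid at the cost of more bits
      # per weight.
      return [interval * round(w / interval) for w in weights]

  # e.g. discretise([0.73, -1.42], 0.5)  ->  [0.5, -1.5]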

Experiments are performed to compare the generalisation
performance of the new networks with that of the conventional
one using three very different benchmarks: the MONK's
benchmark, a set of artificial tasks designed to compare the
capabilities of learning algorithms; the `onset of diabetes
mellitus' prediction data set, a realistic set with very noisy
attributes; and finally the handwritten numeral recognition
database, a realistic but very structured data set. The results
indicate that the new networks, despite having strong
constraints on their weights, have generalisation performances
similar to those of their conventional counterparts.

-- 
Altaf.



