Connections per Second
Scott.Fahlman@B.GP.CS.CMU.EDU
Tue Aug 29 20:43:21 EDT 1989
Why not just record the execution time in seconds?
Because implementations that differ by several orders of magnitude in raw
speed cannot be compared using the identical network. For some simulators
it would take forever to measure the runtime of any net with more than 100
connections; for other implementations, that's not even enough to fill up
the pipeline. As long as the time required by a given implementation grows
more or less linearly with the number of connections in the network, CPS
tells you something useful about that implementation: how long it will take
to run some number of passes on a given network. And if two systems
implement the same algorithm (e.g. standard backprop), then the CPS numbers
give you a rough way of comparing them: one Warp equals three
Commodore-64's, or whatever.
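The runtime claim above can be sketched with a little arithmetic: if CPS grows roughly linearly with network size, then runtime is just total connection-updates divided by the CPS rating. The CPS figures and network size below are made-up illustrations, not measurements from any real system.

```python
# Estimate total runtime from a simulator's CPS rating,
# assuming CPS scales roughly linearly with network size.

def runtime_seconds(connections, passes, cps):
    """Total connection-updates divided by connections processed per second."""
    return connections * passes / cps

# A hypothetical 10,000-connection net trained for 1,000 passes,
# on a 1 MCPS implementation versus a 10 KCPS one:
fast = runtime_seconds(10_000, 1_000, 1_000_000)  # 10.0 seconds
slow = runtime_seconds(10_000, 1_000, 10_000)     # 1000.0 seconds
print(fast, slow)
```

The same two-orders-of-magnitude ratio holds for any network of this family, which is exactly why the raw CPS numbers are useful within one algorithm.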
* They don't facilitate the performance comparison of
connectionist models and non-connectionist models
(e.g. statistical pattern classifiers).
Are we afraid of such comparisons?
CPS numbers are clearly not useful for comparing two different neural net
algorithms, unless they are very close to one another or unless there is
some formal transformation from one algorithm into another. A Boltzmann
CPS is very different from a backprop CPS or a Hopfield CPS. So of course
there's no way to compare backprop to some statistical classifier using CPS
numbers -- they are only good for comparing members of the same family.
You're right, when it comes to comparing the speed of a given connectionist
model against a statistical model (or two very different connectionist
models), about the only way we can do it is to compare total runtime, on
the same machine, programmed by equally competent hackers, and with the
same degree of optimization for speed vs. flexibility. If any of these
conditions fails to hold, you can be off by a large factor, though the
result may still serve for a crude order-of-magnitude comparison of
runtimes.
* If CPS numbers become the accepted yardstick
with which to measure execution models then one is
likely to penalize new approaches which are slower
per connection but, overall, require fewer connections.
Usually the tradeoff is between the speed of a single training cycle and
the number of cycles needed, not the size of the net. But in any case, I
think there is no danger as long as we all realize that CPS is useless in
comparing two significantly different algorithms.
* CPS numbers are very hard to interpret, as we have seen
in the recent discussion about CPS.
That's why we're having this discussion. Maybe we can agree on a common
metric that is useful, at least in a very limited set of circumstances. I
suspect that the CPS numbers currently being tossed around for backprop are
ambiguous by a factor of two, and it would be nice to sort that out.
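One plausible source of that factor-of-two ambiguity (an assumption on my part, not something the post spells out) is whether a backprop "connection" is counted once per training pass or once each for the forward and backward traversals. The numbers below are illustrative, not real benchmark figures.

```python
# Two counting conventions for backprop CPS that differ by exactly 2x.
# All figures here are hypothetical, for illustration only.

def cps(connections, passes, seconds, count_backward_pass=False):
    """Connection-updates per second under a chosen counting convention."""
    ops = connections * passes
    if count_backward_pass:
        ops *= 2  # count each connection again for the backward pass
    return ops / seconds

# The same measured run, reported two different ways:
forward_only = cps(10_000, 1_000, 10.0)                            # 1,000,000 CPS
both_passes  = cps(10_000, 1_000, 10.0, count_backward_pass=True)  # 2,000,000 CPS
print(forward_only, both_passes)
```

Unless a reported CPS number states which convention it uses, two identical machines can appear to differ by a factor of two.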
-- Scott