Tech Report Available
Tom Dietterich
tgd at turing.CS.ORST.EDU
Thu Sep 13 17:08:26 EDT 1990
The following tech report is available in compressed postscript format
from the neuroprose archive at Ohio State.
A Comparison of
ID3 and Backpropagation
for English Text-to-Speech Mapping
Thomas G. Dietterich
Hermann Hild
Ghulum Bakiri
Department of Computer Science
Oregon State University
Corvallis, OR 97331-3102
Abstract
The performance of the error backpropagation (BP) and decision tree
(ID3) learning algorithms was compared on the task of mapping English
text to phonemes and stresses. Under the distributed output code
developed by Sejnowski and Rosenberg, it is shown that BP consistently
out-performs ID3 on this task by several percentage points. Three
hypotheses explaining this difference were explored: (a) ID3 is
overfitting the training data, (b) BP is able to share hidden units
across several output units and hence can learn the output units
better, and (c) BP captures statistical information that ID3 does not.
We conclude that only hypothesis (c) is correct. By augmenting ID3
with a simple statistical learning procedure, the performance of BP
can be approached but not matched. More complex statistical
procedures can improve the performance of both BP and ID3
substantially. A study of the residual errors suggests that there is
still substantial room for improvement in learning methods for
text-to-speech mapping.
This is an expanded version of a short paper that appeared at the
Seventh International Conference on Machine Learning in Austin, TX, in
June 1990.
To retrieve via FTP, use the following procedure:
unix> ftp cheops.cis.ohio-state.edu # (or ftp 128.146.8.62)
Name (cheops.cis.ohio-state.edu:): anonymous
Password (cheops.cis.ohio-state.edu:anonymous): <ret>
ftp> cd pub/neuroprose
ftp> binary
ftp> get
(remote-file) dietterich.comparison.ps.Z
(local-file) foo.ps.Z
ftp> quit
unix> uncompress foo.ps.Z
unix> lpr -P(your_local_postscript_printer) foo.ps
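
For anyone who would rather script the transfer than type it in, the
interactive session above can be approximated with Python's standard
ftplib module. This is only a sketch, not part of the original
procedure: the helper name fetch_report is made up for illustration,
the host, directory, and file name are copied from the steps above, and
the server may no longer be reachable at that address. The function
takes an already-connected FTP object so it can be tried against any
server.

```python
from ftplib import FTP

# Values copied from the announcement; the 1990 host may no longer exist.
HOST = "cheops.cis.ohio-state.edu"
REMOTE_DIR = "pub/neuroprose"
REMOTE_FILE = "dietterich.comparison.ps.Z"


def fetch_report(ftp, local_path, remote_dir=REMOTE_DIR, remote_file=REMOTE_FILE):
    """Download remote_file in binary mode over a connected FTP session.

    fetch_report is a hypothetical helper name, not part of ftplib.
    """
    ftp.login()              # no arguments: anonymous login
    ftp.cwd(remote_dir)      # same as "ftp> cd pub/neuroprose"
    with open(local_path, "wb") as out:
        # retrbinary transfers in binary mode, like "ftp> binary" then "get"
        ftp.retrbinary("RETR " + remote_file, out.write)
    ftp.quit()               # same as "ftp> quit"
    return local_path
```

Assuming the server answers, fetch_report(FTP(HOST), "foo.ps.Z") would
leave foo.ps.Z in the current directory. The uncompress and lpr steps
are unchanged, since the Python standard library has no reader for the
.Z format.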