Similarity to Cascade-Correlation
Steve Hanson
jose at learning.siemens.com
Thu Aug 9 10:14:39 EDT 1990
Thanks for the clarification...

However, as I understand CART, it is not required to construct
axis-parallel hyperplanes (as ID3 etc. do); like CC, any hyperplane
is possible. Now, as I understand CC, it does freeze the weights of
each hidden unit once asymptotic learning takes place, and it feeds
the frozen unit's output (i.e., its hyperplane decision or
discriminant function) as input to the next candidate hidden unit.
Consequently, CC does not "...keep all of the training
data together and [keep] retraining the output units (weights?) as it
incrementally adds hidden units".
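
To pin down the structure I mean, here is a rough Python sketch of the
freeze-and-cascade step as I understand it (the names are mine, not
Fahlman's, and the correlation-based candidate training is stubbed out
as train_candidate):

    import numpy as np

    def grow_cascade(X, y, n_hidden, train_candidate):
        """Sketch of the CC growth loop described above.

        train_candidate(features, y) -> (w, b) stands in for whatever
        procedure trains one candidate unit to asymptote on the current
        feature set; its details don't matter for the structural point.
        """
        features = X      # each new unit sees the raw inputs...
        frozen = []       # ...plus every previously frozen unit's output
        for _ in range(n_hidden):
            w, b = train_candidate(features, y)
            frozen.append((w, b))             # weights frozen for good
            h = np.tanh(features @ w + b)     # the unit's discriminant
            features = np.hstack([features, h[:, None]])
        return frozen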
As to higher-order hidden units... I guess I see what you mean.
However, don't the units below simply send up a decision concerning
the subset of the data which they have correctly classified?
Consequently, the units above see the usual input features plus a
newly learned hidden-unit feature indicating that some subset
of the input vectors lies on one side of its decision surface, right?
Consequently, the next hidden unit in the "cascade" can learn
to ignore that subset of the input space and concentrate on
other parts of the input space that require yet another hyperplane.
It seems as though this would produce a branching tree of discriminants
similar to CART. No?
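
To make the tree analogy concrete, here is a toy illustration
(entirely my own sketch; I use hard-threshold units for readability,
though CC's units are smooth). The second unit puts a large negative
weight on the first unit's decision, so its own hyperplane effectively
matters only for the subset the first unit left behind, much as a CART
split refines only one branch of its parent:

    import numpy as np

    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])

    def decide(x, w, b):
        # Hard-threshold version of a unit's hyperplane decision.
        return (x @ w + b > 0).astype(float)

    # Frozen unit 1 carves off the subset with x0 + x1 > 1.5, i.e. (1,1).
    d1 = decide(X, np.array([1., 1.]), -1.5)

    # Unit 2 sees the raw inputs plus d1; the -10 weight on d1 vetoes
    # unit 2's own hyperplane (x0 + x1 > 0.5) on unit 1's subset.
    X2 = np.hstack([X, d1[:, None]])
    d2 = decide(X2, np.array([1., 1., -10.]), -0.5)

    print(d1)  # [0. 0. 0. 1.] -> unit 1 claims (1,1)
    print(d2)  # [0. 1. 1. 0.] -> unit 2 handles the rest (XOR, in fact)

On these four points the second unit ends up computing XOR by itself,
precisely because it can condition on the first unit's decision, which
is the sense in which the cascade looks like a tree of discriminants.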
Steve