Similarity to Cascade-Correlation
Steve Hanson
jose at learning.siemens.com
Thu Aug 9 10:14:39 EDT 1990
Thanks for the clarification...

However, as I understand CART, it is not required to construct
axis-parallel hyperplanes (as ID3 etc. do); like CC, any hyperplane
is possible. Now, as I understand CC, it does freeze the weights of
each hidden unit once asymptotic learning takes place, and it feeds
the frozen unit's output (i.e., its hyperplane decision or
discriminant function) as input to the next candidate hidden unit.
Consequently, CC does not "...keep all of the training
data together and [keep] retraining the output units (weights?) as it
incrementally adds hidden units".
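
To pin down the structure I mean, here is a rough Python sketch of the
freeze-and-cascade step as I understand it (the names are mine, not
Fahlman's, and the correlation-based candidate training is stubbed out
as train_candidate):

    import numpy as np

    def grow_cascade(X, y, n_hidden, train_candidate):
        """Sketch of the CC growth loop described above.

        train_candidate(features, y) -> (w, b) stands in for whatever
        procedure trains one candidate unit to asymptote on the current
        feature set; its details don't matter for the structural point.
        """
        features = X      # each new unit sees the raw inputs...
        frozen = []       # ...plus every previously frozen unit's output
        for _ in range(n_hidden):
            w, b = train_candidate(features, y)
            frozen.append((w, b))             # weights frozen for good
            h = np.tanh(features @ w + b)     # the unit's discriminant
            features = np.hstack([features, h[:, None]])
        return frozen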
As to higher-order hidden units... I guess I see what you mean.
However, don't the units below simply send up a decision concerning
the subset of the data which they have correctly classified?
Consequently, the units above see the usual input features plus a
newly learned hidden-unit feature indicating that some subset
of the input vectors lies on one side of its decision surface, right?
Consequently, the next hidden unit in the "cascade" can learn
to ignore that subset of the input space and concentrate on
other parts of the input space that require yet another hyperplane.
It seems as though this would produce a branching tree of discriminants
similar to CART. No?
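
To make the tree analogy concrete, here is a toy illustration
(entirely my own sketch; I use hard-threshold units for readability,
though CC's units are smooth). The second unit puts a large negative
weight on the first unit's decision, so its own hyperplane effectively
matters only for the subset the first unit left behind, much as a CART
split refines only one branch of its parent:

    import numpy as np

    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])

    def decide(x, w, b):
        # Hard-threshold version of a unit's hyperplane decision.
        return (x @ w + b > 0).astype(float)

    # Frozen unit 1 carves off the subset with x0 + x1 > 1.5, i.e. (1,1).
    d1 = decide(X, np.array([1., 1.]), -1.5)

    # Unit 2 sees the raw inputs plus d1; the -10 weight on d1 vetoes
    # unit 2's own hyperplane (x0 + x1 > 0.5) on unit 1's subset.
    X2 = np.hstack([X, d1[:, None]])
    d2 = decide(X2, np.array([1., 1., -10.]), -0.5)

    print(d1)  # [0. 0. 0. 1.] -> unit 1 claims (1,1)
    print(d2)  # [0. 1. 1. 0.] -> unit 2 handles the rest (XOR, in fact)

On these four points the second unit ends up computing XOR by itself,
precisely because it can condition on the first unit's decision, which
is the sense in which the cascade looks like a tree of discriminants.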
Steve