Connectionists: Results of the ALvsPK challenge

Isabelle Guyon isabelle at clopinet.com
Tue Aug 28 16:06:57 EDT 2007


Results of the Agnostic Learning vs. Prior Knowledge Challenge
-----------------------------------------------------------------
For the first few months of the challenge, AL led over PK, showing that 
the development of good AL classifiers is considerably faster. As of 
March 1st 2007, PK was leading over AL on four out of five datasets. We 
extended the challenge by five more months, but the best performances did 
not significantly improve during that time period. On datasets not 
requiring real expert domain knowledge (ADA, GINA, SYLVA), the 
participants entering both tracks obtained better results in the PK 
track, using a 
special-purpose coding of the inputs and/or the outputs, exploiting the 
knowledge of which features were uninformative, and using "shared 
weights" for redundant features. For two datasets (HIVA and NOVA), the 
raw data was not in a feature representation and required some domain 
knowledge to preprocess. The winning data representations consist of 
low-level features ("molecular fingerprints" and "bag of words"). From 
the analysis of this 
challenge, we conclude that agnostic learning methods are very powerful. 
They quickly yield (in 40 to 60 days) performance near the best 
achievable. General-purpose techniques for 
exploiting prior knowledge in the encoding of inputs or outputs or the 
design of the learning machine architecture (e.g. via shared weights) 
may provide an additional performance boost, but exploiting real domain 
knowledge is both hard and time-consuming. The net result of using 
domain knowledge rather than using low-level features and relying on 
agnostic learning may actually be to worsen results, as experienced by 
some entrants. This fact seems to be a recurrent theme in machine 
learning publications and the results of our challenge confirm it. 
Future work includes incorporating the best identified methods in our 
challenge toolkit CLOP <http://www.agnostic.inf.ethz.ch/models.php>. The 
challenge web site <http://www.agnostic.inf.ethz.ch/> remains open for 
post-challenge submissions.

For more details on the analysis, see: 
http://clopinet.com/isabelle/Projects/agnostic/Results.html.



