[Research] Auton Lab Meeting: Monday July 21 10am NSH 1507

Artur Dubrawski awd at cs.cmu.edu
Thu Jul 17 01:22:58 EDT 2008


Dear Autonians,

We will have a guest speaker, Daria Sorokina from Cornell, who will
give the following, highly relevant talk.

Time and place are given in the subject of this message. Coffee/tea and
cookies will be provided on the spot.

Looking forward to seeing you there.
Artur

-------
Title: 
Modeling Additive Structure and Detecting Interactions with Groves of Trees

Abstract:
	A lot of research in machine learning and data mining is concentrated
on building prediction models with the best possible performance. In most
cases such models act as black boxes: they make good predictions, but do not
provide much insight into the decision making process. This is unsatisfactory
for domain scientists who also want to answer questions like: What effects do
important features have on the response variable? Which features are involved
in complex effects and should be studied only together with some other
features? How can we visualize and interpret such complex effects? Separate
post-processing techniques are needed to answer these questions.
	The term statistical interaction is used to describe the presence of
non-additive effects among two or more variables in a function. When
variables interact, their effects must be modeled and interpreted
simultaneously. Thus, detecting statistical interactions can be critical for
domain researchers trying to understand the underlying processes.
	In this talk I will describe an approach to interaction detection
based on comparing the performance of unrestricted and restricted prediction
models, where restricted models are prevented from modeling an interaction in
question. I will present a new algorithm, Additive Groves, that has the right
properties for this framework. Additive Groves is an ensemble of additive
regression trees, based on such techniques as bagging and additive models;
their combination allows us to use large trees in the ensemble and at the
same time model additive structure of the response function. 
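	The restricted-versus-unrestricted comparison described above can be
illustrated with a minimal sketch. It is not Additive Groves: as a stand-in,
it assumes plain least-squares fits on hand-built polynomial features, where
the restricted model only sees columns that depend on one variable at a time.
All function names, the synthetic data, and the noise level are hypothetical
and chosen only for illustration.

```python
import numpy as np

def fit_rmse(feat_train, y_train, feat_test, y_test):
    """Least-squares fit on training features; return RMSE on the test features."""
    coef, *_ = np.linalg.lstsq(feat_train, y_train, rcond=None)
    resid = feat_test @ coef - y_test
    return float(np.sqrt(np.mean(resid ** 2)))

def additive_features(x):
    # Every column depends on a single variable, so a linear model on these
    # columns cannot represent an x1-x2 interaction (the restricted model).
    x1, x2 = x[:, 0], x[:, 1]
    return np.column_stack([np.ones(len(x)), x1, x1 ** 2, x2, x2 ** 2])

def unrestricted_features(x):
    # Same columns plus a product term that uses x1 and x2 jointly.
    x1, x2 = x[:, 0], x[:, 1]
    return np.column_stack([additive_features(x), x1 * x2])

def interaction_gap(x, y, n_train):
    """Restricted test RMSE minus unrestricted test RMSE.

    A clearly positive gap suggests that x1 and x2 interact.
    """
    xtr, xte = x[:n_train], x[n_train:]
    ytr, yte = y[:n_train], y[n_train:]
    restricted = fit_rmse(additive_features(xtr), ytr,
                          additive_features(xte), yte)
    unrestricted = fit_rmse(unrestricted_features(xtr), ytr,
                            unrestricted_features(xte), yte)
    return restricted - unrestricted

# Synthetic data (an assumption for this sketch, not the bird abundance data).
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(2000, 2))
noise = 0.1 * rng.normal(size=2000)
y_additive = np.sin(x[:, 0]) + x[:, 1] ** 2 + noise   # no interaction
y_interacting = x[:, 0] * x[:, 1] + noise             # pure interaction

gap_additive = interaction_gap(x, y_additive, 1000)
gap_interacting = interaction_gap(x, y_interacting, 1000)
print("gap, additive target:   ", gap_additive)
print("gap, interacting target:", gap_interacting)
```

On the additive target the two models should perform about the same, while
on the interacting target the restricted model should be noticeably worse.
In practice the size of a meaningful gap would have to be judged against the
variability of the performance estimates.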
	I will demonstrate results of interaction detection analysis on real
data describing the abundance of different species of birds in the prairies
to the east of the southern Rocky Mountains.
	In the second part I will talk more about the regression ensemble
Additive Groves and its classification counterpart, Gradient Groves. I will
show that these algorithms yield consistently high performance across a
variety of problems, outperforming on average a large number of other
algorithms.
	This is joint work with Rich Caruana and Mirek Riedewald.




