[Research] Reminder - Brent Bryan Thesis Proposal Today at 11:00AM - 1/29/07
Jeff Schneider
schneide at cs.cmu.edu
Mon Jan 29 09:37:57 EST 2007
Please join us for Brent's thesis proposal this morning at 11am.
Jeff.
--------------------------------------------------------------
Date: January 29, 2007
Time: 11:00
Place: 1507 NSH
PhD Candidate: Brent Bryan
Title: Active Learning Search Strategies for Computing Level Sets:
Determining Large-Scale Joint Confidence Interval Computations
Abstract:
In many scientific applications, one is less interested in determining
the point which maximizes a function than in knowing the set of all
points at which the function lies above some specified level. For example,
consider the task of computing the set of parameters for a given
parameterized model which fit some observed data reasonably
well. If we have an oracle which can tell us fit "goodness" as a
function of input parameters, this problem becomes that of determining
the level set which delineates those models that fit well from those
that do not. In astrophysics, we can use this idea to study the early
universe as well as galactic history. By computing confidence intervals
for parameterized models in the first case, astronomers can determine
the age, composition and eventual fate of the universe, while for the
second query astronomers hope to decipher how galaxies form and evolve
over time.
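
As a rough illustration of the level-set view described above, here is a
minimal Python sketch. Everything in it is an assumption made for
illustration: the one-parameter constant model, the toy data, the
goodness() oracle, and the threshold of 0.5 below the best fit are all
hypothetical and are not taken from the proposal. It queries the oracle
on a dense grid, which is the kind of brute-force search the proposed
work aims to improve on.

```python
def goodness(theta, data, sigma=1.0):
    """Hypothetical oracle: Gaussian log-likelihood (up to a constant)
    of a constant model y = theta. Higher values mean a better fit."""
    return -sum((y - theta) ** 2 for y in data) / (2.0 * sigma ** 2)


def level_set_on_grid(oracle, grid, level):
    """Return every grid point whose oracle value is at or above `level`."""
    return [theta for theta in grid if oracle(theta) >= level]


# Toy observations (made up for this example).
data = [0.9, 1.1, 1.0, 1.2, 0.8]

# Dense grid over the single parameter: 0.00, 0.01, ..., 2.00.
grid = [i / 100 for i in range(201)]

# Illustrative threshold: accept parameters whose goodness is within
# 0.5 of the best value found on the grid (an arbitrary choice here).
best = max(goodness(t, data) for t in grid)
level = best - 0.5

accepted = level_set_on_grid(lambda t: goodness(t, data), grid, level)
print(f"accepted interval: [{min(accepted)}, {max(accepted)}]")
```

Note that nearly all of the 201 oracle calls land well inside or well
outside the accepted interval; only the handful near its two endpoints
actually determine the answer.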
While several techniques have been developed to compute the level set
of a function, most are either inefficient or lack convergence
guarantees for finite sample sizes (or both). For instance, one common
way to compute confidence intervals is to use MCMC. However, MCMC can
converge to incorrect solutions when chain lengths are limited.
Additionally, MCMC is a procedure for sampling from a posterior
distribution, and as such, it spends a large number of experiments on
regions that are well away from the boundary of the confidence region,
reducing its efficiency. Instead, we propose a frequentist-based
technique that combines statistical hypothesis tests with an active
learning framework to specifically explore the confidence region
boundary, while simultaneously sampling the remaining parameter space
sufficiently to prove the convergence of the algorithm with a finite
number of samples.
In this thesis, we will develop an active learning framework to
efficiently compute function level sets. We will demonstrate the
utility of the derived framework by developing a frequentist-based
statistical inference technique to efficiently compute confidence
intervals for model parameters with respect to some observed data. We
will then extend this framework to handle the simultaneous computation of
confidence intervals for millions of galaxies. Additionally, we will
develop algorithms to prove that the resulting intervals are correct
within some predefined tolerance.
Thesis Committee:
Jeff Schneider (Chair)
Chris Genovese
Andrew Moore (Google)
Chris Miller (Cerro Tololo Inter-American Observatory, Chile)
Bob Nichol (Institute of Cosmology and Gravitation, University of
Portsmouth)
Larry Wasserman
--
*******************************************************************
Diane Stidle
Business & Graduate Programs Manager
Machine Learning Department
School of Computer Science
4612 Wean Hall
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213-3891
Phone: 412-268-1299
Fax: 412-268-3431
Email: diane at cs.cmu.edu
URL: http://www.ml.cmu.edu