[Research] Fwd: Thesis Defense - Ajit Singh - 9/30/09
Ajit Paul Singh
ajit at cs.cmu.edu
Sun Sep 27 23:09:44 EDT 2009
I'll be defending this Wednesday at noon. Everyone's welcome to
attend (NSH 1507).
-- Ajit
Begin forwarded message:
> From: Diane Stidle <diane+ at cs.cmu.edu>
> Date: September 17, 2009 1:52:23 PM CDT
> To: ml-seminar at cs.cmu.edu
> Subject: Thesis Defense - Ajit Singh - 9/30/09
>
> Thesis Defense
>
> Date: 9/30/09
> Time: 12:00pm
> Place: 1507 NSH
>
> PhD Candidate: Ajit Singh
>
> Title: Efficient Models for Relational Learning
> Abstract:
> Information integration deals with the setting where one has
> multiple sources of data, each describing different properties of
> the same set of entities. We are concerned primarily with settings
> where the properties are pairwise relations between entities, and
> attributes of entities. We want to predict the value of relations
> and attributes, but relations between entities violate the standard
> statistical assumption that data points (entities) are exchangeable.
> Furthermore, we desire models that scale gracefully as the number
> of entities and relations increases.
>
> Matrices are the simplest form of relational data, and we begin by
> distilling the literature on low-rank matrix factorization into a
> small number of modelling choices. We then frame information
> integration as simultaneously factoring sets of related matrices:
> i.e., Collective Matrix Factorization. Each entity is described by
> a small number of parameters, and if an entity is described by more
> than one matrix, those parameters participate in multiple matrix
> factorizations. Maximum likelihood estimation of the resulting
> model involves a large non-convex optimization, which we reduce to
> cyclically solving convex optimizations over small subsets of the
> parameters. Each convex subproblem can be solved by Newton-Raphson,
> which we extend to the setting of stochastic Newton-Raphson.
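The alternating scheme described above can be illustrated with a toy sketch (not the thesis's actual estimator): two matrices share one factor, and each factor update is the exact closed-form solution of a convex ridge least-squares subproblem, i.e. a single Newton step for squared loss. All names, and the squared-loss/ridge setup, are illustrative assumptions:

```python
import numpy as np

def collective_mf(X, Y, k=3, lam=0.01, iters=100, seed=0):
    """Jointly factor X ~ U @ V.T and Y ~ V @ W.T, sharing V.

    Cyclic updates: each factor is re-estimated by ridge least squares
    while the others are held fixed, so every subproblem is convex and
    solved exactly in closed form.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    _, g = Y.shape
    U = rng.normal(scale=0.1, size=(m, k))
    V = rng.normal(scale=0.1, size=(n, k))
    W = rng.normal(scale=0.1, size=(g, k))
    I = lam * np.eye(k)
    for _ in range(iters):
        # U given V: ridge regression of X's rows onto V.
        U = X @ V @ np.linalg.inv(V.T @ V + I)
        # V given U, W: V appears in both factorizations, so both
        # data matrices contribute to its update.
        V = (X.T @ U + Y @ W) @ np.linalg.inv(U.T @ U + W.T @ W + I)
        # W given V: ridge regression of Y's columns onto V.
        W = Y.T @ V @ np.linalg.inv(V.T @ V + I)
    return U, V, W

# Synthetic check: recover an exact rank-3 pair of related matrices.
rng = np.random.default_rng(1)
U0 = rng.normal(size=(30, 3))
V0 = rng.normal(size=(20, 3))
W0 = rng.normal(size=(6, 3))
X, Y = U0 @ V0.T, V0 @ W0.T
U, V, W = collective_mf(X, Y)
```

The key point of the sketch is the middle update: because V parameterizes entities that appear in both matrices, information flows between the two factorizations through V.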
>
> To address the limitations of maximum likelihood estimation in
> matrix factorization models, we extend our approach to the
> hierarchical Bayesian setting. Here, Bayesian estimation involves
> computing a high-dimensional integral with no analytic form. If we
> resorted to standard Metropolis-Hastings techniques, slow mixing
> would limit the scalability of our approach to large sets of
> entities. We show how to accelerate Metropolis-Hastings by using
> our efficient solution for maximum likelihood estimation to guide
> the sampling process.
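The idea of letting the maximum likelihood fit guide the sampler can be sketched in one dimension (a toy Gaussian-mean posterior, not the thesis's sampler): start the chain at the MLE and scale random-walk proposals by the curvature there, so the chain begins in the high-probability region rather than drifting toward it. All names here are illustrative assumptions:

```python
import numpy as np

def mh_from_mle(log_post, x0, step, n_samples=2000, seed=0):
    """Random-walk Metropolis-Hastings started at a mode estimate.

    Initializing at the MLE avoids a long burn-in from a poor starting
    point, and a curvature-based step size keeps acceptance healthy.
    """
    rng = np.random.default_rng(seed)
    x, lp = x0, log_post(x0)
    samples = np.empty(n_samples)
    for i in range(n_samples):
        prop = x + step * rng.normal()
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:  # MH accept/reject
            x, lp = prop, lp_prop
        samples[i] = x
    return samples

# Toy posterior: mean of N(mu, 1) data under a flat prior.
y = np.random.default_rng(2).normal(loc=1.5, size=100)
mle = y.mean()                    # closed-form MLE of mu
step = 1.0 / np.sqrt(len(y))      # 1 / sqrt(observed information)
draws = mh_from_mle(lambda mu: -0.5 * np.sum((y - mu) ** 2), mle, step)
```

In the toy case the posterior is known exactly, which makes the benefit easy to check: the chain's mean and spread match the analytic posterior almost immediately, with no burn-in period to discard.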
>
> This thesis rests on two claims: (i) that Collective Matrix
> Factorization can effectively integrate different sources of data
> to improve prediction; and (ii) that training scales well as the
> number of entities and observations increases. Two real-world data
> sets are considered in experimental support of these claims:
> augmented collaborative filtering and augmented brain imaging. In
> augmented collaborative filtering, we show that genre information
> about movies can be used to increase the predictive accuracy of
> users' ratings. In augmented brain imaging, we show that word
> co-occurrence information can be used to increase the predictive
> accuracy of a model of changes in brain activity to word stimuli,
> even in regions of the brain that were never included in the
> training data.
>
> http://www.cs.cmu.edu/~ajit/pubs/ajit-thesis-submitted.pdf
>
> Thesis Committee:
> Geoff Gordon (chair)
> Tom Mitchell
> Christos Faloutsos
> Pedro Domingos (University of Washington)
>
> --
> *******************************************************************
> Diane Stidle
> Business & Graduate Programs Manager
> Machine Learning Department
> School of Computer Science
> 8203 Gates Hillman Complex
> Carnegie Mellon University
> 5000 Forbes Avenue
> Pittsburgh, PA 15213-3891
> Phone: 412-268-1299
> Fax: 412-268-3431
> Email: diane at cs.cmu.edu
> URL: http://www.ml.cmu.edu
>
>