[Research] Fwd: Thesis Proposal - Ajit Singh - 6/16/08
Ajit Paul Singh
ajit at cs.cmu.edu
Sun Jun 15 13:23:26 EDT 2008
Begin forwarded message:
> From: Diane Stidle <diane+ at cs.cmu.edu>
> Date: June 3, 2008 1:34:55 PM EDT
> To: ml-seminar at cs.cmu.edu, Pedro Domingos
> <pedrod at cs.washington.edu>, Tom Mitchell <Tom.Mitchell at cs.cmu.edu>,
> Christos <christos at cs.cmu.edu>, Geoff <ggordon at cs.cmu.edu>
> Cc: Steve <fienberg at stat.cmu.edu>
> Subject: Thesis Proposal - Ajit Singh - 6/16/08
>
> Thesis Proposal - Ajit Singh
>
> Date: June 16, 2008
> Time: 3:00pm
> Place: 3305 Newell-Simon Hall
>
> Title: Efficient Models for Relational Learning
>
> Abstract:
> The primary difference between propositional (attribute-value) and
> relational data is the existence of relations, or links, between
> entities. Graphs, relational databases, sets of tensors, and
> first-order knowledge bases are all examples of relational encodings.
> Because of the relations between entities, standard statistical
> assumptions, such as independence of entities, are violated.
> Moreover, these correlations should not be ignored, as they provide
> a source of information that can significantly improve the accuracy
> of common machine learning tasks (e.g., prediction, clustering) over
> propositional alternatives. A current limitation in relational
> models is that learning and inference are often substantially more
> expensive than propositional alternatives. One of our objectives
> is the development of models that account for uncertainty in
> relational data while scaling to very large data sets, which often
> cannot fit in main memory. To that end, we propose representing
> relational data as a set of tensors, one per relation, whose
> dimensions index different entity types in the data set. Each tensor
> has a low-dimensional approximation, and tensors that share an
> entity type share the low-dimensional factor for that type. For the
> case of matrices, we refer to this model as collective matrix
> factorization.
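
The shared-factor idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the proposal: the two relations (users x movies and movies x genres), the function name `collective_mf`, the squared-error loss with ridge regularization, and the alternating least squares updates. The only point it shows is that the movie factor `V` appears in both approximations, X ~ U V^T and Y ~ V Z^T.

```python
import numpy as np

def collective_mf(X, Y, rank=2, lam=0.1, sweeps=30, seed=0):
    """Jointly factor X ~ U @ V.T and Y ~ V @ Z.T with a shared factor V.

    Hypothetical illustration: X is users x movies, Y is movies x genres.
    Uses alternating ridge-regularized least squares (an assumed choice).
    """
    rng = np.random.default_rng(seed)
    n_users, n_movies = X.shape
    n_genres = Y.shape[1]
    U = 0.1 * rng.standard_normal((n_users, rank))
    V = 0.1 * rng.standard_normal((n_movies, rank))
    Z = 0.1 * rng.standard_normal((n_genres, rank))
    I = lam * np.eye(rank)

    def objective():
        # Squared reconstruction error on both relations plus ridge penalty.
        fit = np.sum((X - U @ V.T) ** 2) + np.sum((Y - V @ Z.T) ** 2)
        reg = lam * (np.sum(U ** 2) + np.sum(V ** 2) + np.sum(Z ** 2))
        return fit + reg

    history = [objective()]
    for _ in range(sweeps):
        # U and Z each touch one relation, so each is a plain ridge solve.
        G = V.T @ V + I
        U = np.linalg.solve(G, (X @ V).T).T
        Z = np.linalg.solve(G, (Y.T @ V).T).T
        # V is shared, so its update pools evidence from both relations.
        A = U.T @ U + Z.T @ Z + I
        V = np.linalg.solve(A, (X.T @ U + Y @ Z).T).T
        history.append(objective())
    return U, V, Z, history
```

Because each update exactly minimizes the regularized objective over one factor with the others held fixed, the recorded objective does not increase from sweep to sweep. The update for `V` is where the coupling happens: side information in `Y` influences the factor used to predict entries of `X`.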
>
> While existing techniques for relational learning assume a batch of
> data, we propose exploring extensions to active and mixed-initiative
> learning, where the learning algorithm can query its environment
> (typically a human user) about relationships between entities, the
> creation of new predicates, and relationships between predicates
> themselves. We believe that the expressiveness of relational
> representations will allow for more efficient interaction between
> the learner and its environment, and will lead to better predictive
> models for relational data. Efficiency refers not only to
> computational efficiency, but also to the efficiency of data
> collection in active learning scenarios. To support the claim that
> our models are efficient, we propose exploring three problems:
> predicting users' ratings of movies with side information, topic
> models for text using fMRI images of neural activation on words,
> and mixed-initiative tagging of e-mail and other information used
> by personal information managers, e.g., tasks from todo lists,
> recently edited files, and calendar entries.
>
> The proposal document is found at:
> http://www.cs.cmu.edu/~ajit/pubs/proposal.pdf
>
> Thesis Committee:
> Geoffrey Gordon (Chair)
> Christos Faloutsos
> Tom Mitchell
> Pedro Domingos (Univ. of Washington)
>
>
> --
> *******************************************************************
> Diane Stidle
> Business & Graduate Programs Manager
> Machine Learning Department
> School of Computer Science
> 4612 Wean Hall
> Carnegie Mellon University
> 5000 Forbes Avenue
> Pittsburgh, PA 15213-3891
> Phone: 412-268-1299
> Fax: 412-268-3431
> Email: diane at cs.cmu.edu
> URL:http://www.ml.cmu.edu
>
>