From ajit at cs.cmu.edu  Sun Jun 15 13:23:26 2008
From: ajit at cs.cmu.edu (Ajit Paul Singh)
Date: Sun, 15 Jun 2008 13:23:26 -0400
Subject: [Research] Fwd: Thesis Proposal - Ajit Singh - 6/16/08
References: <484580BF.5070907@cs.cmu.edu>
Message-ID: <8706BF5A-7A7A-41E9-86D5-CF6ED9F1BAA2@cs.cmu.edu>

Begin forwarded message:

> From: Diane Stidle
> Date: June 3, 2008 1:34:55 PM EDT
> To: ml-seminar at cs.cmu.edu, Pedro Domingos, Tom Mitchell, Christos,
> Geoff
> Cc: Steve
> Subject: Thesis Proposal - Ajit Singh - 6/16/08
>
> Thesis Proposal - Ajit Singh
>
> Date: June 16, 2008
> Time: 3:00pm
> Place: 3305 Newell-Simon Hall
>
> Title: Efficient Models for Relational Learning
>
> Abstract:
> The primary difference between propositional (attribute-value) and
> relational data is the existence of relations, or links, between
> entities. Graphs, relational databases, sets of tensors, and
> first-order knowledge bases are all examples of relational encodings.
> Because of the relations between entities, standard statistical
> assumptions, such as the independence of entities, are violated.
> Moreover, these correlations should not be ignored: they provide a
> source of information that can significantly improve the accuracy of
> common machine learning tasks (e.g., prediction, clustering) over
> propositional alternatives. A current limitation of relational models
> is that learning and inference are often substantially more expensive
> than for propositional alternatives. One of our objectives is the
> development of models that account for uncertainty in relational data
> while scaling to very large data sets, which often cannot fit in main
> memory. To that end, we propose representing relational data as a set
> of tensors, one per relation, whose dimensions index the different
> entity types in the data set. Each tensor has a low-dimensional
> approximation, and tensors share a low-dimensional factor for each
> entity type they have in common. For the case of matrices, we refer
> to this model as collective matrix factorization.
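To make the shared-factor construction above concrete, here is a
minimal sketch of collective matrix factorization on two synthetic
relations that share the movie entity type: a users-by-movies ratings
matrix and a movies-by-genres side-information matrix, fit by joint
gradient descent on a weighted squared reconstruction error. The
shapes, step size, and loss weighting are illustrative assumptions,
not details taken from the proposal itself.

    import numpy as np

    # Illustrative collective matrix factorization:
    # ratings X ~ U @ V.T and side information Y ~ V @ W.T,
    # with the movie factor V shared between the two relations.
    rng = np.random.default_rng(0)
    n_users, n_movies, n_genres, k = 50, 40, 6, 5

    X = rng.random((n_users, n_movies))   # users x movies (ratings)
    Y = rng.random((n_movies, n_genres))  # movies x genres (side info)

    U = 0.1 * rng.standard_normal((n_users, k))
    V = 0.1 * rng.standard_normal((n_movies, k))  # shared movie factor
    W = 0.1 * rng.standard_normal((n_genres, k))

    lr, alpha = 0.01, 0.5  # step size; alpha weights the two losses
    for step in range(500):
        Ex = U @ V.T - X  # residual of the ratings relation
        Ey = V @ W.T - Y  # residual of the side-information relation
        gV = alpha * Ex.T @ U + (1 - alpha) * Ey @ W  # both pull on V
        U -= lr * alpha * Ex @ V
        W -= lr * (1 - alpha) * Ey.T @ V
        V -= lr * gV

    loss = alpha * ((U @ V.T - X) ** 2).sum() + \
           (1 - alpha) * ((V @ W.T - Y) ** 2).sum()
    print(f"weighted squared reconstruction error: {loss:.3f}")

The key point is the gradient for V: it accumulates terms from both
relations, which is how side information about movies can sharpen
rating predictions in the shared low-dimensional space.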
> While existing techniques for relational learning assume a batch of
> data, we propose exploring extensions to active and mixed-initiative
> learning, where the learning algorithm can query its environment
> (typically a human user) about relationships between entities, the
> creation of new predicates, and relationships between the predicates
> themselves. It is our belief that the expressiveness of relational
> representations will allow for more efficient interaction between
> the learner and its environment, and will lead to better predictive
> models for relational data. Efficiency refers not only to
> computational efficiency, but also to the efficiency of data
> collection in active learning scenarios. To support the claim that
> our models are efficient, we propose exploring three problems:
> predicting users' ratings of movies with side information, topic
> models for text using fMRI images of neural activation on words, and
> mixed-initiative tagging of e-mail and other information used by
> personal information managers---e.g., tasks from todo lists, recently
> edited files, and calendar entries.
>
> The proposal document is found at:
> http://www.cs.cmu.edu/~ajit/pubs/proposal.pdf
>
> Thesis Committee:
> Geoffrey Gordon (Chair)
> Christos Faloutsos
> Tom Mitchell
> Pedro Domingos (Univ. of Washington)
>
> --
> *******************************************************************
> Diane Stidle
> Business & Graduate Programs Manager
> Machine Learning Department
> School of Computer Science
> 4612 Wean Hall
> Carnegie Mellon University
> 5000 Forbes Avenue
> Pittsburgh, PA 15213-3891
> Phone: 412-268-1299
> Fax: 412-268-3431
> Email: diane at cs.cmu.edu
> URL: http://www.ml.cmu.edu

From awd at cs.cmu.edu  Mon Jun 23 16:11:00 2008
From: awd at cs.cmu.edu (Artur Dubrawski)
Date: Mon, 23 Jun 2008 16:11:00 -0400
Subject: [Research] Auto Lab meeting this Friday
Message-ID: <48600354.4030402@cs.cmu.edu>

Once is nice, twice is better! We are going to take part in an
experiment validating that statement on Friday 6/27 at 12:30 in NSH
1507, when not just one but two talks will be presented, each by a
distinguished member of our distinguished team. The titles are given
below. It should be easy to associate each title with the respective
speaker's name. If you cannot do that... well, then clearly you have
one more reason never to skip our lab meetings.

Food will be provided. As well as:

"Fast Incremental Nearest Neighbor Search in Large Graphs"

and

"Actively Learning Level-Sets of Composite Functions"

See you all,
Artur