From nathan at cmu.edu Thu Oct 28 19:34:42 2010
From: nathan at cmu.edu (Nathan Schneider)
Date: Thu, 28 Oct 2010 19:34:42 -0400
Subject: [CL+NLP Lunch] CL+NLP Lunch: Brian MacWhinney (11/3)
Message-ID:

Dear all,

Brian MacWhinney will speak about models of language learning next Wednesday for the second meeting of LTI's seminar on Computational Linguistics and Natural Language Processing. Details about the talk are below.

Cheers,
Nathan & Ben

--

CL+NLP Lunch
Wednesday, November 3
NSH 3305

Food will be served at 11:45; the talk will begin promptly at noon.

*Brian MacWhinney*
Professor, CMU Psychology Department

*Item-based patterns, computation, and the brain*

*Abstract:*
Young children build up sentences by combining words into clusters. Unification grammars such as HPSG, LFG, or Minimalism recognize the importance of such clusters, but rely on combinations of part of speech categories whose development is never explained. The alternative approach to clustering that I have developed emphasizes the role of item-based patterns in early acquisition. These patterns are initially specific to individual lexical operators such as "more", "my" or "want". Children then induce higher-level feature-based patterns through feature pruning, much as in the theory of Hierarchical Bayesian Models. A left-associative processor can use patterns on these various levels to generate the required sentence patterns of the target language.

In this talk, I will:
1. review developmental evidence for the shift from item-based to feature-based patterns;
2. explain how this shift provides a solution to the Logical Problem of Language Acquisition;
3. examine recent work in computational modeling of language learning and show why it needs to pay more attention to the shift from item-based to feature-based patterns; and
4. link the theory of item-based patterns to core facts about language processing in the brain.
*Bio:*
Brian MacWhinney is Professor of Psychology, Computational Linguistics, and Modern Languages at Carnegie Mellon University. He has developed a model of first and second language processing and acquisition based on competition between item-based patterns. Data for these models come from the CHILDES (Child Language Data Exchange System) database, which he has developed. He is now extending this spoken language database system to six additional research areas in the form of the TalkBank Project. MacWhinney's recent work includes studies of online learning of second language vocabulary and grammar, neural network modeling of lexical development, fMRI studies of children with focal brain lesions, and ERP studies of between-language competition. He is also exploring the role of grammatical constructions in the marking of perspective shifting and the construction of mental models in scientific reasoning.

http://www.cs.cmu.edu/~nlp-lunch/

From benlambert at cmu.edu Tue Nov 2 14:24:01 2010
From: benlambert at cmu.edu (Benjamin Lambert)
Date: Tue, 2 Nov 2010 14:24:01 -0400
Subject: [CL+NLP Lunch] CL+NLP Lunch: Brian MacWhinney (11/3) [REMINDER!]
Message-ID: <1A070A7B-7E7F-4AAE-BFBC-457EE96F47EE@cmu.edu>

Dear all,

This is a reminder that Brian MacWhinney will speak about models of language learning *tomorrow* for the second meeting of LTI's NLP lunch! Details about the talk are below.

Cheers,
Nathan & Ben

--

CL+NLP Lunch
Wednesday, November 3
NSH 3305

Food will be served at 11:45; the talk will begin promptly at noon.

Brian MacWhinney
Professor, CMU Psychology Department

Item-based patterns, computation, and the brain

http://www.cs.cmu.edu/~nlp-lunch/

--
Benjamin Lambert
Ph.D. Student of Computer Science
Carnegie Mellon University
www.cs.cmu.edu/~belamber
Mobile: 617-869-1844

From nathan at cmu.edu Wed Nov 24 23:12:49 2010
From: nathan at cmu.edu (Nathan Schneider)
Date: Wed, 24 Nov 2010 23:12:49 -0500
Subject: [CL+NLP Lunch] CL+NLP Lunch: Jacob Eisenstein (12/1)
Message-ID:

Everyone,

This is a heads-up that next Wednesday, Dec. 1, will be the third CL+NLP Lunch Seminar. Jacob Eisenstein from the Machine Learning Department will present work on modeling sociolinguistic aspects of social media text (watch for a more detailed abstract early next week). It will be in GHC 4405 at noon.

Enjoy the holiday,
Nathan & Ben

From benlambert at cmu.edu Tue Nov 30 03:27:03 2010
From: benlambert at cmu.edu (Benjamin Lambert)
Date: Tue, 30 Nov 2010 03:27:03 -0500
Subject: [CL+NLP Lunch] CL+NLP Lunch this Wed: Jacob Eisenstein on linguistic variation
Message-ID: <36A22FE7-5A99-402A-8D76-731947A25395@cmu.edu>

Dear everyone,

This is a reminder that the Computational Linguistics and Natural Language Processing lunch will be held this Wednesday at noon in GHC 4405. Jacob Eisenstein will be speaking about studying geographic linguistic variation from text; details are below. Lunch will be served, as usual :-).

Best,
Nathan & Ben

--

CL+NLP Lunch
Wednesday, December 1
GHC 4405
12pm-1:30pm

Jacob Eisenstein
Postdoc, CMU Machine Learning Department

TITLE: Large-Scale Dialectology and Sociolinguistics from Social Media

ABSTRACT: Sociolinguistics and dialectology study how language varies across socially-distinct groups of speakers.
While these fields feature a strong quantitative tradition, the standard methodology requires the researcher to specify the linguistic dimensions of variability in advance -- before correlating them against extra-linguistic factors. Moreover, much of this work depends on interviews for gathering data, raising problematic issues of how to elicit "truly" vernacular speech. However, the rapid growth of social media offers exciting new possibilities for the study of socially-oriented linguistic variation. Using a new corpus of geo-tagged text from Twitter, we have developed two computational techniques for studying linguistic variation from raw text. These methods are capable of identifying both coherent linguistic communities and specific lexical features that distinguish social and geographical groups. Applying these methods to Twitter, we have discovered new and robust lexical-geographic relationships that were undocumented in prior work. In addition, we are able to use raw text to accurately predict metadata such as the geographic location of social media content authors.

Speaker bio: Jacob Eisenstein is a postdoctoral fellow in the Machine Learning Department at Carnegie Mellon University. His research focuses on machine learning for discourse, non-verbal communication, and social media. Jacob completed his Ph.D. at MIT in 2008, winning the George M. Sprowls award for his dissertation, "Gesture in Automatic Discourse Processing."

http://www.cs.cmu.edu/~nlp-lunch/