[Intelligence Seminar] Nov. 15, 3:30pm: Presentation by Mark Dredze

Dana Houston dhouston at cs.cmu.edu
Mon Nov 14 12:17:08 EST 2011




> INTELLIGENCE SEMINAR
> NOVEMBER 15 AT 3:30PM, IN GHC 4303
>
> SPEAKER: MARK DREDZE (Johns Hopkins University)
> Host: Carolyn Penstein Rose
> For meetings, contact Dana Houston (dhouston at cs.cmu.edu)
>
> TOPIC MODELS FOR MINING PUBLIC HEALTH INFORMATION FROM TWITTER
>
> Twitter and other social media sites contain a wealth of information about
> populations and have been used to track sentiment towards products, measure
> political attitudes, and study social linguistics. In this talk, we
> investigate the potential for Twitter to impact public health research.
> Specifically, we consider population surveillance, a major focus of public
> health that typically depends on clinical encounters with health
> professionals to collect patient data. Individual users often broadcast
> salient health information, such as "sick with this flu fever taking over
> my body ughhhh time for tylenol", which indicates that this person not only
> has the flu, but also has a fever and is self-medicating with Tylenol.
> Aggregating such content across millions of users could provide
> information about numerous aspects of illnesses in the population.
>
> In this work we present the Ailment Topic Aspect Model (ATAM), a new
> Bayesian graphical model for Twitter that associates symptoms, treatments,
> and general words with diseases (ailments). When applied to 1.6 million
> health-related tweets, ATAM discovers descriptions of diseases in terms of
> collections of words (symptoms and treatments) and partitions messages
> based on the referenced disease. The model discovers diseases
> corresponding to influenza, infections, obesity, insomnia, and several
> others. Furthermore, we demonstrate the effectiveness of this model at
> several tasks: tracking illnesses over time (syndromic surveillance),
> measuring behavioral risk factors, localizing illnesses by geographic
> region, and analyzing symptoms and medication usage. We show quantitative
> correlations with public health data and qualitative evaluations of model
> output. Our results suggest that Twitter has broad applicability for
> public health research.
>
> BIO
>
> Mark Dredze is an Assistant Research Professor in Computer Science at
> Johns Hopkins University, as well as a member of the Center for Language
> and Speech Processing and the Human Language Technology Center of
> Excellence. His research in natural language processing and machine
> learning has focused on graphical models, semi-supervised learning,
> information extraction, large-scale learning, speech processing, and
> health informatics. He obtained his PhD from the University of
> Pennsylvania in 2009.
>

-- 
Dana M. Houston
Language Technologies Institute
School of Computer Science
Carnegie Mellon University
5405 Gates Hillman Complex
5000 Forbes Avenue
Pittsburgh, PA 15213

T:  (412)268-4717
F:  (412)268-6298


