[Research] [auton-users] proposal talk.

Artur Dubrawski awd at cs.cmu.edu
Fri Aug 27 11:58:25 EDT 2010


Figures :)

It is good to virtually see you though!

On 8/27/2010 11:50 AM, Paul Komarek wrote:
> I have a nail appointment that day.
>
> On Fri, Aug 27, 2010 at 8:49 AM, Artur Dubrawski<awd at cs.cmu.edu>  wrote:
>> You're not coming Paul???
>>
>>
>> On 8/27/2010 11:48 AM, Paul Komarek wrote:
>>>
>>> good luck Robin!
>>>
>>> On Fri, Aug 27, 2010 at 7:15 AM, Robin Sabhnani<sabhnani+ at cs.cmu.edu>
>>>   wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I am giving my thesis proposal talk this afternoon. You are welcome to
>>>> attend it. See announcement below.
>>>>
>>>> ####################
>>>>
>>>> Date: 8/27/10
>>>> Time: 3:00pm
>>>> Place: 4405 GHC
>>>>
>>>> PhD Candidate: Maheshkumar (Robin) Sabhnani
>>>>
>>>> Title: Disjunctive Anomaly Detection: Identifying Complex Anomalous
>>>> Patterns
>>>>
>>>> Abstract:
>>>>
>>>> The problem of anomaly detection in multivariate time series data is
>>>> common to many applications of practical interest. Examples include
>>>> network intrusion detection systems, manufacturing processes,
>>>> climate studies, syndromic surveillance, and video stream processing.
>>>> Our motivating application is syndromic surveillance that aims to detect
>>>> potential disease outbreaks in pre-diagnosis data to facilitate timely
>>>> public health response. To achieve this goal, efficient data structures
>>>> and smart algorithms are needed to analyze highly multivariate temporal
>>>> data.
>>>>
>>>> In this thesis work, we introduce Disjunctive Anomaly Detection (DAD), an
>>>> algorithm for detecting complex anomalous clusters in multivariate
>>>> datasets with categorical dimensions. Our proposed algorithm assumes
>>>> that an anomalous cluster can affect any subset of data dimensions (using
>>>> conjunctions) and any subset of values (using disjunctions) along each
>>>> data dimension. We believe that such a cluster definition is more
>>>> informative of real outbreaks than current approaches.
>>>> In addition, the DAD algorithm models multiple anomalous clusters
>>>> simultaneously, hence promising better detection power in the presence
>>>> of multiple overlapping anomalous events. So far, we have compared the DAD
>>>> algorithm against relevant, powerful alternatives on two important
>>>> tasks: finding sample-variable associations in cancer microarray data,
>>>> and searching for emerging disease outbreaks in public health data.
>>>> Experimental results indicate that DAD is able to detect and explain
>>>> complex anomalous clusters better than the alternative approaches such
>>>> as the Large Average Submatrix (LAS) algorithm and the What's Strange
>>>> About Recent Events (WSARE) algorithm.
>>>>
>>>> To assist in the development of future complex multidimensional and
>>>> multivariate algorithms (including extensions to DAD), we also introduce
>>>> the T-Cube data structure that efficiently represents any time series
>>>> data with multiple categorical dimensions (typical in many fields of
>>>> application including surveillance). The T-Cube data structure (inspired
>>>> by AD-Trees for categorical count data) acts as a cache and quickly
>>>> responds to ad hoc queries during an investigation. It enables
>>>> processing of millions of time series during massive data mining
>>>> operations. We have successfully applied T-Cube to mine interesting
>>>> patterns in diverse projects involving temporal event data.
>>>>
>>>> Thesis Committee:
>>>> Artur Dubrawski (Co-chair)
>>>> Jeff Schneider (Co-chair)
>>>> Aarti Singh
>>>> Greg Cooper (University of Pittsburgh)
>>>> _______________________________________________
>>>> Research mailing list
>>>> Research at autonlab.org
>>>> https://www.autonlab.org/mailman/listinfo/research
>>>>
>>>
>>
>


