From mjbaysek at cs.cmu.edu Mon Aug 2 10:37:14 2010
From: mjbaysek at cs.cmu.edu (Michael J. Baysek)
Date: Mon, 02 Aug 2010 10:37:14 -0400
Subject: [auton-users] Server Maintenance Today after 5PM
Message-ID: <4C56D81A.1080703@cs.cmu.edu>

Today, after 5PM, our file server will be going down for some maintenance. Be sure to save all of your work. Any compute jobs that are in process will resume automatically when the system comes back up.

Services affected during the maintenance window:

* Compute nodes will be unavailable for new logins.
* All data on /auton space will be offline.
* Desktops still relying on /auton for home directories will not be available.

From mjbaysek at cs.cmu.edu Mon Aug 2 20:38:37 2010
From: mjbaysek at cs.cmu.edu (Michael J. Baysek)
Date: Mon, 02 Aug 2010 20:38:37 -0400
Subject: [auton-users] Auton Lab System Maintenance Complete
Message-ID: <4C57650D.4070603@cs.cmu.edu>

Lab,

The system maintenance is complete. Any processes that were running before the maintenance window should pick up where they left off, automatically.

There may be a need for additional maintenance windows in the coming days or weeks. I will try to stay sensitive to any deadlines, as long as the issues don't escalate to critical. Please let me know of any scheduling or deadlines that you have coming, so that I may stay in tune with all that is going on.

Best,
Mike

From mjbaysek at cs.cmu.edu Wed Aug 4 10:29:36 2010
From: mjbaysek at cs.cmu.edu (Michael J. Baysek)
Date: Wed, 04 Aug 2010 10:29:36 -0400
Subject: [auton-users] Auton System Status
Message-ID: <4C597950.5050907@cs.cmu.edu>

Hi Lab,

The system is mostly restored. I am still restoring some services, but primary services are available.

Yesterday evening, the power in the SCS machine room was severed. I have been told that the outage occurred due to an incorrectly performed test on the new fire suppression system. The test was performed by an external inspector.
The power interruption was long enough that all of the backup batteries went dead, and took the entire system (and the Wean Hall 3611 SCS machine room) completely offline.

That said, you should know that any and all compute jobs that were running have been terminated, since every last server went down. It is likely that you will see (or have seen) a notice from SCS explaining the unexpected outage.

Normally, I receive a call from SCS when any of our public-facing machines go down. This time, however, I received no phone call, because they were in crisis mode themselves to get things like DNS, mail, and AFS restored. In crisis mode, their policy is not to deal at all with 'project servers' until the crisis is resolved. They are still here in droves in the machine room finishing up their recovery.

Because the entire machine room, including networking equipment, lost power, I received no notifications when anything went down. Additionally, I did not get my email until this morning, which was the first I had learned of the problem.

As a result of this incident, I will be setting up extensive off-site monitoring, so that even if all else fails, I will still be notified of issues in a timely manner.

I recommend that everybody store my office number (412) 268-8939 and mobile number (412)-229-7356 in your cell phone. Please feel welcome to call me if you see any extended, unannounced, or unexpected down times. Please don't call my mobile during off-hours for user requests and things that can wait until the next day - use email for that. Please understand that there are just too many of you, and I'd be getting calls all the time at home and while out.

Again, the system is 95% restored. The 5% that is still to be done is not likely to affect most of you.

Cheers,
Mike

From mjbaysek at cs.cmu.edu Wed Aug 4 14:02:47 2010
From: mjbaysek at cs.cmu.edu (Michael J. Baysek)
Date: Wed, 04 Aug 2010 14:02:47 -0400
Subject: [auton-users] System Trouble: Saga Continues
Message-ID: <4C59AB47.7030903@cs.cmu.edu>

/auton File server is down. No ETA. Will keep you informed.

Mike

From mjbaysek at cs.cmu.edu Wed Aug 4 16:33:21 2010
From: mjbaysek at cs.cmu.edu (Michael J. Baysek)
Date: Wed, 04 Aug 2010 16:33:21 -0400
Subject: [auton-users] Auton System Status: Action Required
Message-ID: <4C59CE91.2060208@cs.cmu.edu>

The system will be down into late this evening as I pull the schizophrenic file server for offline diagnostics. I am building a secondary file server that will host the data until the primary server is behaving. You will receive status updates as the status changes.

*********

Users, please mail me a list of all of the directories you are actively using on the system. I will migrate all current student and staff home directories by default.

Alumni and past users: I will copy your data after all current user data is finished copying.

All Users, if there are other directories, in places like /auton/public, /auton/userdirs or /auton/data, please mail me a list of those folders you need. The fastest way to a working system is if I selectively copy data first, and copy the rest (less active) data later.

*********

From mjbaysek at cs.cmu.edu Wed Aug 4 22:45:52 2010
From: mjbaysek at cs.cmu.edu (Michael J. Baysek)
Date: Wed, 04 Aug 2010 22:45:52 -0400
Subject: [auton-users] Auton System Status
Message-ID: <4C5A25E0.1090505@cs.cmu.edu>

The system is again restored. You must reboot your /auton desktop in order to be able to work. There are a few other things you may care to know:

A few facts about this afternoon's (to now) outage:

** The main file server (LYRE) that served /auton storage failed in a way that required it to be pulled offline. At the moment, the problem appears to be related to a strange interplay between the brand of disks and the storage controller.
The instability seems to be hit or miss, depending on some unknown variable during boot. If you get a good boot, it seems to run stable, but 8 times out of 10, it doesn't. For the last 6 months, we have been riding on a 'good boot', but all of the reboots since Friday's crash have been 'bad reboots'.

** The outage was actually in no way related to the outage from last evening. It's more like a step brother to the outages on Friday and Monday. The outage on Monday was planned, and its intent was to prevent outages like today's, which it didn't.

** There have been numerous minor failures of this type since Friday. Today, the failure forcibly unmounted /auton from the live server, outside of my control.

** We could not continue with that server until the root cause of the problem is rectified, which it still is not.

Solution:

** LOT2 has been commandeered to be used as the temporary file server. LOT2 is therefore unavailable for compute jobs.

** Only a partial data copy of /auton space has been done to the temporary server. Due to the controller malfunction in the other server, I am not comfortable copying data while the server is unattended, so the rest of the data will commence copying tomorrow when I arrive at the office. All /auton data is intact; it's just not online yet.

** I have already copied all active accounts' /auton/home/* directories to the temporary server. If I missed your account, please let me know.

** If you have data which I have not yet copied and that you need access to, you need to email me the approximate location of that data so it can be inserted into the 'priority list' for copying.

I hope that this is the last time I have to write to you all, for a while, anyway.

Mike

From mjbaysek at cs.cmu.edu Thu Aug 5 16:34:47 2010
From: mjbaysek at cs.cmu.edu (Michael J. Baysek)
Date: Thu, 05 Aug 2010 16:34:47 -0400
Subject: [auton-users] Autonwin and Autonwin64
Message-ID: <4C5B2067.3030506@cs.cmu.edu>

Hi Lab,

This is for those of you who use the two Windows instances, autonwin and autonwin64. These machines have been offline since the server switch. They will be restored tomorrow.

Thanks,
Mike

From mjbaysek at cs.cmu.edu Fri Aug 6 12:54:36 2010
From: mjbaysek at cs.cmu.edu (Michael J. Baysek)
Date: Fri, 06 Aug 2010 12:54:36 -0400
Subject: [auton-users] System Status Update: All Systems Go
Message-ID: <4C5C3E4C.3020706@cs.cmu.edu>

Hi Lab,

The transition to the stand-in server is, by and large, complete.

* autonwin, autonwin64, sdss (Liang's), and the PHIS TCWI are now all back up and running.

* The large majority of home directories, even those of many past members, are restored. The remaining directories are not going to be restored except by request. This is due to the limited storage of the stand-in server. If you need your directory and it has not already been restored, please contact me right away.

* /auton/userdirs is restored.

* /auton/data is restored with the exception of the following: medline and mimic2db's waveform_db subdirectory. This was, again, an effort to keep a reasonable amount of free space. This data is stored safely; it's just too big to put online right now without some more wizardry. If these directories are needed now, or in the near future, please contact me right away.

*** Please do all of your checking of things today, and let me know right away if you notice something awry. I am celebrating my wife's birthday out of town, and won't be at the computer much of the time. If you need to contact me, please mail for general support, or text me if the request is critical. 412-229-7356.

Mike

From mjbaysek at cs.cmu.edu Fri Aug 6 12:56:58 2010
From: mjbaysek at cs.cmu.edu (Michael J. Baysek)
Date: Fri, 06 Aug 2010 12:56:58 -0400
Subject: [auton-users] System Status Update: All Systems Go
In-Reply-To: <4C5C3E4C.3020706@cs.cmu.edu>
References: <4C5C3E4C.3020706@cs.cmu.edu>
Message-ID: <4C5C3EDA.5070308@cs.cmu.edu>

I neglected to mention that LOT2 will be unavailable until further notice. Please contact me when you need additional computing power beyond what is currently available.

Be sure to check the status page at http://www.autonlab.org/status . This is now more important, since there are fewer CPUs available. Also, remember not to run any heavy jobs directly on LOP1.

Mike

From sabhnani+ at cs.cmu.edu Fri Aug 27 10:15:02 2010
From: sabhnani+ at cs.cmu.edu (Robin Sabhnani)
Date: Fri, 27 Aug 2010 10:15:02 -0400
Subject: [auton-users] proposal talk.
Message-ID: <4C77C866.6010807@cs.cmu.edu>

Hi all,

I am giving my thesis proposal talk this afternoon. You are welcome to attend it. See announcement below.

####################

Date: 8/27/10
Time: 3:00pm
Place: 4405 GHC

PhD Candidate: Maheshkumar (Robin) Sabhnani

Title: Disjunctive Anomaly Detection: Identifying Complex Anomalous Patterns

Abstract:

The problem of anomaly detection in multivariate time series data is common to many applications of practical interest. A few examples include network intrusion detection systems, manufacturing processes, climate studies, syndromic surveillance, video stream processing, etc. Our motivating application is syndromic surveillance that aims to detect potential disease outbreaks in pre-diagnosis data to facilitate timely public health response. To achieve this goal, efficient data structures and smart algorithms are needed to analyze highly multivariate temporal data.

In this thesis work, we introduce Disjunctive Anomaly Detection (DAD), an algorithm for detecting complex anomalous clusters in multivariate datasets with categorical dimensions. Our proposed algorithm assumes that an anomalous cluster can affect any subset of data dimensions (using conjunctions) and any subset of values (using disjunctions) along each data dimension. We believe that such a cluster definition is more informative of the real outbreaks as compared to the current approaches. In addition, the DAD algorithm models multiple anomalous clusters simultaneously, hence promising better detection power in the presence of multiple overlapping anomalous events.
So far, we have compared the DAD algorithm against the relevant powerful alternatives on two important tasks: finding sample-variable associations in cancer microarray data, and searching for the emerging disease outbreaks in public health data. Experimental results indicate that DAD is able to detect and explain complex anomalous clusters better than the alternative approaches such as the Large Average Submatrix (LAS) algorithm and the What's Strange About Recent Events (WSARE) algorithm.

To assist in the development of future complex multidimensional and multivariate algorithms (including extensions to DAD), we also introduce the T-Cube data structure that efficiently represents any time series data with multiple categorical dimensions (typical in many fields of application including surveillance). The T-Cube data structure (inspired by AD-Trees for categorical count data) acts as a cache and quickly responds to any ad-hoc queries during an investigation. It enables processing of millions of time series during massive data mining operations. We have successfully applied T-Cube to mine interesting patterns in diverse projects involving temporal event data.

Thesis Committee:
Artur Dubrawski (Co-chair)
Jeff Schneider (Co-chair)
Aarti Singh
Greg Cooper (University of Pittsburgh)

From komarek.paul at gmail.com Fri Aug 27 11:48:46 2010
From: komarek.paul at gmail.com (Paul Komarek)
Date: Fri, 27 Aug 2010 08:48:46 -0700
Subject: [auton-users] [Research] proposal talk.
In-Reply-To: <4C77C866.6010807@cs.cmu.edu>
References: <4C77C866.6010807@cs.cmu.edu>
Message-ID:

good luck Robin!

On Fri, Aug 27, 2010 at 7:15 AM, Robin Sabhnani wrote:
> Hi all,
>
> I am giving my thesis proposal talk this afternoon. You are welcome to
> attend it. See announcement below.

From awd at cs.cmu.edu Fri Aug 27 11:49:41 2010
From: awd at cs.cmu.edu (Artur Dubrawski)
Date: Fri, 27 Aug 2010 11:49:41 -0400
Subject: [auton-users] [Research] proposal talk.
In-Reply-To:
References: <4C77C866.6010807@cs.cmu.edu>
Message-ID: <4C77DE95.9020009@cs.cmu.edu>

You're not coming Paul???

On 8/27/2010 11:48 AM, Paul Komarek wrote:
> good luck Robin!
>
> On Fri, Aug 27, 2010 at 7:15 AM, Robin Sabhnani wrote:
>> Hi all,
>>
>> I am giving my thesis proposal talk this afternoon. You are welcome to
>> attend it. See announcement below.

From komarek.paul at gmail.com Fri Aug 27 11:50:27 2010
From: komarek.paul at gmail.com (Paul Komarek)
Date: Fri, 27 Aug 2010 08:50:27 -0700
Subject: [auton-users] [Research] proposal talk.
In-Reply-To: <4C77DE95.9020009@cs.cmu.edu>
References: <4C77C866.6010807@cs.cmu.edu> <4C77DE95.9020009@cs.cmu.edu>
Message-ID:

I have a nail appointment that day.

On Fri, Aug 27, 2010 at 8:49 AM, Artur Dubrawski wrote:
> You're not coming Paul???
>
>
> On 8/27/2010 11:48 AM, Paul Komarek wrote:
>>
>> good luck Robin!
>>
>> On Fri, Aug 27, 2010 at 7:15 AM, Robin Sabhnani wrote:
>>>
>>> Hi all,
>>>
>>> I am giving my thesis proposal talk this afternoon. You are welcome to
>>> attend it. See announcement below.