From donghanw at cs.cmu.edu Sat Jun 8 18:53:39 2013
From: donghanw at cs.cmu.edu (Donghan (Jarod) Wang)
Date: Sat, 8 Jun 2013 18:53:39 -0400
Subject: [auton-users] low1 restored
Message-ID:

Dear Auton users,

Low1, a compute node, has been rebooted due to system overload. All jobs were terminated. All services on the node are up and running now. Please check your jobs.

Date/Time
---------------
Overloaded since 06/08 around 2:30 PM
Rebooted on 06/08 6:50 PM

Description
----------------
A user job exhausted the memory and caused a kernel crash.

Please let me know if you have any questions/concerns.

Thanks,
Jarod

From donghanw at cs.cmu.edu Tue Jun 11 12:32:32 2013
From: donghanw at cs.cmu.edu (Donghan (Jarod) Wang)
Date: Tue, 11 Jun 2013 12:32:32 -0400
Subject: [auton-users] Compute-0-0 restored
Message-ID:

Dear Auton users,

Compute-0-0, a compute node, was rebooted unexpectedly due to a kernel crash. All jobs were terminated. All services on the node are up and running now. Please check your jobs.

Date/Time
---------------
Down since 06/11 around 8:30 AM
Rebooted on 06/11 12:29 PM

Description
----------------
The exact cause of the crash is unclear. Further investigation will be conducted.

Please let me know if you have any questions.

Thanks for your attention.
Jarod

From donghanw at cs.cmu.edu Thu Jun 27 13:08:10 2013
From: donghanw at cs.cmu.edu (Donghan (Jarod) Wang)
Date: Thu, 27 Jun 2013 13:08:10 -0400
Subject: [auton-users] Lou1 restored
Message-ID:

Dear Auton users,

Lou1, a compute node, has been rebooted due to system overload. All jobs were terminated. All services on the node are up and running now. Please check your jobs.

Date/Time
---------------
Overloaded since 06/26 around 2:30 AM
Rebooted on 06/27 12:30 PM

Description
----------------
A user job exhausted the memory and caused a kernel crash.

Please let me know if you have any questions/concerns.

Thanks,
Jarod

From donghanw at cs.cmu.edu Fri Jun 28 13:36:33 2013
From: donghanw at cs.cmu.edu (Donghan (Jarod) Wang)
Date: Fri, 28 Jun 2013 13:36:33 -0400
Subject: [auton-users] Hadoop cluster
Message-ID:

Dear Auton users,

I'm pleased to announce the availability of the Auton cloud computing service, a Hadoop cluster. The cluster consists of five nodes:

- gd1
- compute-0-0
- compute-0-1
- compute-0-2
- compute-0-3

HDFS
=====
Your home directory is at /user/ in the HDFS (31TB available). There is no backup for the time being.

Tutorial
======
To run MapReduce jobs on the cluster, ssh to any of the above nodes. Here is a short Hadoop tutorial:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/HadoopTutorial/CDH4/Hadoop-Tutorial.html

To view the monitoring sites, you'll need to forward the corresponding ports to your local computer. Auton desktop users (Linux) can view them directly.

Monitoring sites
============
- Job Tracker: http://gd1.int.autonlab.org:50030/jobtracker.jsp
- HDFS Namenode: http://gd1.int.autonlab.org:50070/dfshealth.jsp

Services
=======
At this point, the cluster provides two services:

- HDFS
- MapReduce

Other services such as HBase and Hue will be added soon.

Best,
Jarod
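
Note on port forwarding
============
The announcement above mentions forwarding the monitoring ports to a local machine without showing how. The following is a minimal sketch of one way to do it, not the lab's documented procedure. The host name and ports 50030 and 50070 are taken from the monitoring URLs above; the helper name build_tunnel_command, the placeholder login "your_login", and the assumption that gd1 accepts your SSH login directly are all illustrative.

```python
import shlex

# Host and ports come from the monitoring URLs in the announcement.
# Using this host as the SSH hop as well is an assumption.
MONITORING_HOST = "gd1.int.autonlab.org"
MONITORING_PORTS = {"jobtracker": 50030, "namenode": 50070}


def build_tunnel_command(user, ssh_host=MONITORING_HOST, ports=MONITORING_PORTS):
    """Return the ssh argument list that forwards each monitoring port to localhost."""
    argv = ["ssh", "-N"]  # -N: run no remote command, only forward ports
    for port in sorted(ports.values()):
        # -L local_port:target_host:target_port
        argv += ["-L", f"{port}:{MONITORING_HOST}:{port}"]
    argv.append(f"{user}@{ssh_host}")
    return argv


if __name__ == "__main__":
    # "your_login" is a placeholder, not a real account name.
    print(shlex.join(build_tunnel_command("your_login")))
```

Running the printed command in a terminal and leaving it open should make the Job Tracker reachable at http://localhost:50030/jobtracker.jsp and the Namenode page at http://localhost:50070/dfshealth.jsp, provided the assumptions above hold for your account.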