From donghanw at cs.cmu.edu Fri Apr 5 09:48:31 2013
From: donghanw at cs.cmu.edu (Donghan (Jarod) Wang)
Date: Fri, 5 Apr 2013 09:48:31 -0400
Subject: [auton-users] Lot2 restored
Message-ID:

Dear Auton users,

Lot2, a compute node, has been rebooted due to system overload. All jobs
were terminated. All services on the node are up and running now.

Date/Time
---------------
Overloaded since 04/04, around 9:20 PM
Rebooted on 04/05, 9:30 AM

Description
----------------
A user job exhausted the node's memory. As a result, the node stopped
responding to the outside world.

Please let me know if you have any questions or concerns.

Thanks,
Jarod

From donghanw at cs.cmu.edu Wed Apr 10 08:23:59 2013
From: donghanw at cs.cmu.edu (Donghan (Jarod) Wang)
Date: Wed, 10 Apr 2013 08:23:59 -0400
Subject: [auton-users] Rebooting low1, 04/10, 2:30 PM
Message-ID:

Dear Auton users,

Low1, a compute node, is scheduled for a reboot on 04/10 at 2:30 PM to
clear blocked IO caused by some user programs. Because of the blocked IO,
normal users are currently unable to log into low1.

The system report shows one running job, and I will contact its owner. If
you believe you have running jobs on low1, or anything else you want to
bring to my attention, please let me know at your earliest convenience.

Thanks,
Jarod

From donghanw at cs.cmu.edu Wed Apr 10 15:10:26 2013
From: donghanw at cs.cmu.edu (Donghan (Jarod) Wang)
Date: Wed, 10 Apr 2013 15:10:26 -0400
Subject: [auton-users] low1 rebooted
Message-ID:

Dear Auton users,

Low1 is back online. All services are up and running.

Thanks,
Jarod