From mjbaysek at cs.cmu.edu  Wed Dec 22 11:43:37 2010
From: mjbaysek at cs.cmu.edu (Michael J. Baysek)
Date: Wed, 22 Dec 2010 11:43:37 -0500
Subject: [auton-users] Auton Downtime Scheduled
Message-ID: <4D122AB9.10106@cs.cmu.edu>

Hi Lab,

All Auton Lab computing resources will be down for maintenance on Wed 
Dec 29, from 7:00 AM to 7:00 PM.   This outage applies also to Neill Lab 
users.

Services affected:  All.  Any processes, jobs, or screen sessions that 
are still running on compute nodes at the time of the scheduled outage 
will be terminated.

Lab-supported desktops will function by request only.  Auton Lab staff, 
you must notify me by Tuesday if you need access to your machine on 
Wednesday.

Happy Holidays Everyone,

Mike


From mjbaysek at cs.cmu.edu  Tue Dec 28 12:12:32 2010
From: mjbaysek at cs.cmu.edu (Michael J. Baysek)
Date: Tue, 28 Dec 2010 12:12:32 -0500
Subject: [auton-users] Auton Downtime Scheduled
Message-ID: <4D1A1A80.1020505@cs.cmu.edu>

Hi Lab,

This is a reminder about the downtime tomorrow.  Please be sure to let 
me know in the next 12 hours if you will require access to your 
lab-supported desktop machine during the outage window.

Mike

Original Notification Follows:

All Auton Lab computing resources will be down for maintenance on Wed 
Dec 29, from 7:00 AM to 7:00 PM.   This outage applies also to Neill Lab 
users.

Services affected:  All.  Any processes, jobs, or screen sessions that 
are still running on compute nodes at the time of the scheduled outage 
will be terminated.

Lab-supported desktops will function by request only.  Auton Lab staff, 
you must notify me by Tuesday if you need access to your machine on 
Wednesday.

Happy Holidays Everyone,

Mike


From mjbaysek at cs.cmu.edu  Wed Dec 29 22:26:02 2010
From: mjbaysek at cs.cmu.edu (Michael J. Baysek)
Date: Wed, 29 Dec 2010 22:26:02 -0500
Subject: [auton-users] Auton Downtime Status
Message-ID: <4D1BFBCA.70505@cs.cmu.edu>

Hi Lab.

Status report (in short): Things are back up... with caveats.

More info:

I was not able to get started until after 11, due to some file copying 
taking longer than estimated (way too many small files).  My original 
estimate of 12 hours was about right, but due to the late start, the 
outage ended about 3.5 hours later than scheduled.

There were many ambitious goals for this outage window.  Not all of 
them were completed.  This is the downside of being a one-person I.T. 
department.  Measuring and redistributing power requirements and 
rewiring took much longer than expected.  Now that this is completed, 
we are ready for the new Hadoop cluster and other future growth.  I'd 
like to thank Robin for giving me a very helpful hand in this very 
boring and time-consuming task.

Now the less happy news.  Yet to be completed is the switchover to the 
new file server.  This was a big goal that I wanted to get done this 
time, but it's not happening.  We'll need to suffer with the slow write 
performance until the next maintenance window (date TBA), which will be 
within the next month.

Additionally, Neill1 is down pending further troubleshooting.  It 
should be back up before midweek next week.

Mike

Happy New Year :)