[auton-users] Auton Downtime Status

Michael J. Baysek mjbaysek at cs.cmu.edu
Wed Dec 29 22:26:02 EST 2010


Hi Lab.

Status report (in short): Things are back up... with caveats.

More info:

I was not able to get started until after 11, due to some file copying 
taking longer than estimated (way too many small files).  My original 
estimate of 12 hours was about right, but due to the late start, the 
outage window ended 3.5 hours later than expected.

There were many ambitious goals undertaken for this outage window.  Not 
all of them were completed.  This is the downside of being a one-person 
I.T. department.  Measuring and redistributing power requirements and 
rewiring took much longer than expected.  Now that this is completed, we 
are ready for the new Hadoop cluster and other future growth.  I'd like 
to thank Robin for giving me a very helpful hand in this very boring and 
time-consuming task.

Now the less happy news.  Yet to be completed is the switchover to the 
new file server.  This was a big goal that I wanted to get done this 
time, but it's not happening.  We'll need to suffer with the slow write 
performance until the next maintenance window (date TBA), which will be 
within the next month.

Additionally, Neill1 is down pending further troubleshooting.  It 
should be back up before midweek next week.

Mike

Happy New Year :)


