[auton-users] Auton Cluster Maintenance on Monday
Michael J. Baysek
mjbaysek at cs.cmu.edu
Fri Jul 8 11:38:07 EDT 2011
Hi Lab,
On this coming Monday, July 11th, starting at 8:00 AM, I will be
taking advantage of the relatively quiet system activity to shut
down the Auton central compute system and perform system
maintenance. This work will set us up with the extra storage and
performance we will need in the coming months and years.
The maintenance includes:
* Switch the file server to Scientific Linux 6, a RHEL clone.
* Switch Linux file services to the NFSv4 protocol.
* Switch to a larger disk array.
* Improve file performance over NFS.
* Upgrade the software on the primary firewall server, time permitting.
During this maintenance, many services will be unavailable, including:
* /auton space, by all access methods.
* All compute nodes.
* CVS and Subversion.
* Network copies of Eclipse, Netbeans, Matlab, etc.
* MySQL server on LYRE.
* ViewVC.
These services are among those that will be largely unaffected, but
they may be restarted or briefly interrupted throughout the course
of the maintenance:
* TCWI instances.
* LOT1 (Project Server).
* GD1 (Project Server).
* Bugzilla.
* SDSS.
I plan to begin the maintenance at 8 AM. The work will last into
the afternoon.
For Lab Staff with desktop workstations:
This work requires me to forcibly unmount /auton from all machines
that mount it, so it's best to log out of your workstation today
(Friday) when you leave the office. If you must leave processes
running, note that any process with open file handles in /auton
space will hang forever. Such processes will not 'pick up where they
left off' when the server comes back online, as they usually would
after a server outage. Make sure anything you leave running only
accesses files on your local disk.
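If you want to check before you leave whether anything on your
workstation is still holding files in /auton, the following is a
minimal sketch of one way to do it. It assumes a Linux machine with
the usual /proc filesystem; the function name find_pids_using is
made up for this example, and it only looks at open file descriptors
and working directories, so it may miss other kinds of references
(memory-mapped files, for instance). Without root you will generally
only see your own processes.

```python
import os


def find_pids_using(prefix, proc_root="/proc"):
    """Return {pid: [paths]} for processes whose working directory or
    open file descriptors point at or below `prefix`.

    Assumes a Linux-style /proc layout; `proc_root` is a parameter only
    so the function can be pointed at a different directory for testing.
    """
    prefix = prefix.rstrip("/")
    hits = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        pid_dir = os.path.join(proc_root, entry)
        links = [os.path.join(pid_dir, "cwd")]
        fd_dir = os.path.join(pid_dir, "fd")
        try:
            links += [os.path.join(fd_dir, fd) for fd in os.listdir(fd_dir)]
        except OSError:
            pass  # process exited, or we lack permission to look
        for link in links:
            try:
                target = os.readlink(link)
            except OSError:
                continue  # link vanished or is unreadable
            # Match /auton itself or anything beneath it, but not
            # unrelated paths that merely share the prefix text.
            if target == prefix or target.startswith(prefix + "/"):
                hits.setdefault(int(entry), []).append(target)
    return hits
```

Calling find_pids_using("/auton") returns a dictionary keyed by
process ID. An empty result suggests nothing you can see is holding
/auton open, though it is not a guarantee given the limits above.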
If you have any questions about the planned outage or a specific
service, please drop me an email.
Mike
P.S. /auton will need to be force unmounted from the NEILL system as
a result of this work. This maintenance should not affect the NEILL
system *unless you are accessing files from /auton*. If you are, those
processes will hang indefinitely.
--
Michael J. Baysek
Systems Analyst
Carnegie Mellon University / Auton Lab
412-268-8939 - mjbaysek at cs.cmu.edu
http://www.autonlab.org