From mjbaysek at cs.cmu.edu Mon Aug 3 11:15:12 2009
From: mjbaysek at cs.cmu.edu (Michael J. Baysek)
Date: Mon, 03 Aug 2009 11:15:12 -0400
Subject: [auton-users] CPU Upgrades and Request from Sysadmin: PLEASE RESPOND
Message-ID: <4A76FF00.4090605@cs.cmu.edu>

Hi Lab,

I have great news. We are going to be upgrading our computing machines to much faster processors. We have tested this upgrade on LOR2, and a test run of AFDL went from taking 12 hours to 7 hours.

To make sure each machine is built with the proper hardware revision, I need to power down and remove every LOP, LOQ, and LOR machine from the rack for a few moments so I can visually inspect the system boards. Any processes that are running will need to be stopped or administratively killed before I can proceed.

If you are a current active user of the lab, please get back to me by WEDNESDAY if you have (or plan to have) any crucial processes running anywhere on the LOP/LOQ/LOR machines in the upcoming weeks. If you expect to have any processes running for > 24 hours, please specify the start and end of your time frame.

I prefer to do this check on a weekend, and all in one shot if possible, which basically means a couple of hours of complete downtime for the machines in question.

Thanks. Please try to respond by Wednesday, so I can formulate the plans.

Best,

--
Michael J. Baysek, Systems Analyst
Carnegie Mellon University - Auton Lab
www.cmu.edu - www.autonlab.org
412-268-8939

From mjbaysek at cs.cmu.edu Fri Aug 7 14:42:14 2009
From: mjbaysek at cs.cmu.edu (Michael J. Baysek)
Date: Fri, 07 Aug 2009 14:42:14 -0400
Subject: [auton-users] LOP*, LOQ*, LOR* DOWNTIME THIS WEEKEND
In-Reply-To: <4A76FF00.4090605@cs.cmu.edu>
References: <4A76FF00.4090605@cs.cmu.edu>
Message-ID: <4A7C7586.3020004@cs.cmu.edu>

I have heard from no one regarding any critical jobs that need to be run this weekend.
Therefore, THIS WEEKEND, I will bring down all of the following machines in order to qualify them for the CPU upgrade.

LOP 1, 2, 3, 4, 5 and 6
LOQ 1, 2, 3
LOR 1

The outage will not affect any other machines, including neill* machines, website, WEBCVS/SVN or any lab infrastructure. CVS will not be accessible to most of you during the downtime, since most of you tunnel through LOP1. Your home directories will still be accessible on your Linux desktops, or by your mapped drives in Windows if you are set up for access this way.

At this point, I do not feel pressed to commit to a precise time that this will happen, but expect that it will be sometime between 6am Saturday and 9am Monday. I will send an email to all of you when the process is complete. The process should take less than 2 hours.

Any jobs still running when I bring the machines down will be killed. Feel free to run whatever you want between now and then, but know that if it isn't finished when I do the shutdown, you'll have to start it again Monday.

There will be another outage for the machines coming up when I install the new CPUs. I will, of course, keep you all informed.

--
Michael J. Baysek, Systems Analyst
Carnegie Mellon University - Auton Lab
www.cmu.edu - www.autonlab.org
412-268-8939

From mjbaysek at cs.cmu.edu Mon Aug 10 07:47:08 2009
From: mjbaysek at cs.cmu.edu (Michael J. Baysek)
Date: Mon, 10 Aug 2009 07:47:08 -0400
Subject: [auton-users] Compute Machine Downtime Completed
Message-ID: <4A8008BC.9030102@cs.cmu.edu>

The downtime has been completed. There will be at least one downtime in the coming weeks to accommodate the upgrades and testing.

Please let me know if you plan to schedule any jobs lasting more than 2-3 days in the coming weeks, so I can plan for them, and keep the impact of any downtime to a minimum.

-Mike

From mjbaysek at cs.cmu.edu Mon Aug 10 16:56:49 2009
From: mjbaysek at cs.cmu.edu (Michael J.
Baysek)
Date: Mon, 10 Aug 2009 16:56:49 -0400
Subject: [auton-users] Compute Machine Downtime Completed
In-Reply-To: <4A8008BC.9030102@cs.cmu.edu>
References: <4A8008BC.9030102@cs.cmu.edu>
Message-ID: <4A808991.2090104@cs.cmu.edu>

At least two people asked which machines had the new CPUs installed. No CPUs were installed during the downtime. The downtime was intended only to qualify the machines for the upgrade, not to actually upgrade them.

Some further testing will be needed to identify the maximum upgrade for most of the machines, since unfortunately they had an older revision of the motherboard than I had hoped.

You will be notified before any further downtime is scheduled.

--
Michael J. Baysek, Systems Analyst
Carnegie Mellon University - Auton Lab
www.cmu.edu - www.autonlab.org
412-268-8939

Michael J. Baysek wrote, On 08/10/2009 07:47 AM:
> The downtime has been completed. There will be at least one downtime
> in the coming weeks to accommodate the upgrades and testing.
>
> Please let me know if you plan to schedule any jobs lasting more than
> 2-3 days in the coming weeks, so I can plan for them, and keep the
> impact of any downtime to a minimum.
>
> -Mike

From mjbaysek at cs.cmu.edu Fri Aug 14 13:15:19 2009
From: mjbaysek at cs.cmu.edu (Michael J. Baysek)
Date: Fri, 14 Aug 2009 13:15:19 -0400
Subject: [auton-users] Sysadmin on Vacation Aug 21-30.
Message-ID: <4A859BA7.4050406@cs.cmu.edu>

Hi Lab,

I'll be out of town starting next Thursday evening, Aug 20, and will not return to the office until Aug 31. I will be in North Carolina for a wedding, and a week of hopefully good weather :)

Please let me know *soon* if you anticipate any special system or project requests during this time. I would very much like to make sure everyone is taken care of in advance of my absence, if possible.

I'll have net access while I am away, but I will not be spending copious amounts of time at the computer.
Please expect a latency of 24 hours if you have any non-urgent requests. If you do have an urgent matter, please use the contact information page at http://www.autonlab.org/auton_intranet/19092.html to contact me. This is where I will post the most current means of contacting me, such as the number at the hotel, when I get it.

Mike