From mjbaysek at cs.cmu.edu Tue Dec 8 07:24:48 2009 From: mjbaysek at cs.cmu.edu (Michael Baysek) Date: Tue, 08 Dec 2009 07:24:48 -0500 Subject: [auton-users] LOQs Message-ID: <4B1E4590.5010705@cs.cmu.edu> Would whoever keeps bringing down LOQ machines please email me? Thanks, Mike From mjbaysek at cs.cmu.edu Tue Dec 8 16:40:23 2009 From: mjbaysek at cs.cmu.edu (Michael J. Baysek) Date: Tue, 08 Dec 2009 16:40:23 -0500 Subject: [auton-users] LOQs In-Reply-To: <4B1E4590.5010705@cs.cmu.edu> References: <4B1E4590.5010705@cs.cmu.edu> Message-ID: <4B1EC7C7.1030609@cs.cmu.edu> Anyone? -- Michael J. Baysek, Systems Analyst Carnegie Mellon University - Auton Lab www.cmu.edu - www.autonlab.org 412-268-8939 Michael Baysek wrote, On 12/08/2009 07:24 AM: > Would whoever keeps bringing down LOQ machines, please email me? > > Thanks, > > Mike From mjbaysek at cs.cmu.edu Tue Dec 15 17:22:17 2009 From: mjbaysek at cs.cmu.edu (Michael Baysek) Date: Tue, 15 Dec 2009 17:22:17 -0500 Subject: [auton-users] Holiday Downtime Message-ID: <4B280C19.2040206@cs.cmu.edu> I need to plan for one to two days of complete downtime while I install the new file server and tend to various other housekeeping chores in the server rack. Are there any days over (or around) the holiday break on which anyone CANNOT afford downtime? I'm collecting this information to help determine the best window to complete the work. - Mike P.S. You might ask, what do I mean by complete downtime? This means no Auton Lab services, desktops, webservers, CVS, etc. are likely to be running. :) From mjbaysek at cs.cmu.edu Mon Dec 21 16:07:06 2009 From: mjbaysek at cs.cmu.edu (Michael J. Baysek) Date: Mon, 21 Dec 2009 16:07:06 -0500 Subject: [auton-users] Changes to Auton Lab DNS Message-ID: <4B2FE37A.1030107@cs.cmu.edu> Hi Lab, We have long had a problem which I will refer to as "DNS duality" where if accessed internally, *.autonlab.org addresses resolve to different IP addresses than they would if accessed externally. 
This is a carryover from the legacy infrastructure from days of yore. This DNS duality creates a number of problems for standardizing configurations. What follows is a description and brief rationale for some changes that are underway regarding DNS. You may or may not care to read it, but I prefer to broadcast information instead of secretly making changes and causing problems that seem to have no explanation. I admit that this changeover may cause some issues, though I will do my best to minimize them. This is the first in a series of changes that are going to make things less trouble-prone in the future. The nature of the change revolves around the fact that we have two (even three) networks we can access our central servers from: the internal 192 network, CMU's network, and the world. Depending on the mode of access, a machine could be using either the internal Auton DNS servers or honest authoritative Internet DNS servers. When the central servers talk to each other, they are doing so over a private (192.168) network. When DNS queries were made for *.autonlab.org hosts, the internal DNS servers dished up the internal 192.168.x.x address of the server in question. These addresses do not match the "real" or external public IP addresses that Internet DNS servers return for these hosts. In effect, DNS duality. To make matters more convoluted, on VPN-connected workstations, depending on how the domain servers and search paths were set, some machines were getting the internal addresses when they queried for *.autonlab.org hosts, and some were getting the public 128.2 addresses. Traffic going to the 192 addresses is transferred over the VPN, incurring overhead for the encryption of the VPN connection, and adding load to the primary firewall machine in our server rack. Whether a machine was using internal DNS or external DNS also affected which hostnames would even resolve. Also, requests coming from inside (192) vs. 
outside IP addresses are treated differently by the firewall, webservers, etc., sometimes leaving one group of users unable to access a resource when they should be able to. To solve this (and other problems related to DNS duality of the autonlab.org DNS), I am introducing the int.autonlab.org domain. When machines are supposed to access internal IPs by default, they will be configured with the int.autonlab.org search suffix. The search suffix is what allows you to say "ssh lop1" instead of "ssh lop1.auton. etc.) Fixing the DNS duality is one step in fixing "once and for all" some nagging issues that have caused lots of wasted sysadmin time over the years. Your comments appreciated, Mike From komarek.paul at gmail.com Mon Dec 21 17:14:03 2009 From: komarek.paul at gmail.com (Paul Komarek) Date: Mon, 21 Dec 2009 14:14:03 -0800 Subject: [auton-users] Changes to Auton Lab DNS In-Reply-To: <4B2FE37A.1030107@cs.cmu.edu> References: <4B2FE37A.1030107@cs.cmu.edu> Message-ID: <459f38470912211414q1a5bbb7cjd6a9b79814dc97a3@mail.gmail.com> My comment: woohoo, and good luck. It will be good to see this old setup die. On Mon, Dec 21, 2009 at 1:07 PM, Michael J. Baysek wrote: > Hi Lab, > > > We have long had a problem which I will refer to as "DNS duality" where if > accessed internally, *.autonlab.org addresses resolve to different IP > addresses than they would if accessed externally. This is a carryover from > the legacy infrastructure from days of yore. This DNS duality creates a > number of problems for standardizing configurations. > > > What follows is a description and brief rationale for some changes that are > underway regarding DNS. You may or may not care to read it, but I prefer to > broadcast information instead of secretly making changes and causing > problems that seem to have no explanation. I admit that this changeover may > cause some issues, though I will do my best to minimize them. 
This is the > first in a series of changes that are going to make things less > trouble-prone in the future. > > > The nature of the change revolves around the fact that we have two (even > three) networks we can access our central servers from, the internal 192 > network, CMU's network, and the world. Depending on which mode of access, a > machine could be using either the internal Auton DNS servers, or honest > authoritative Internet DNS servers. > > > When the central servers talk to each other, they are doing so over a > private (192.168) network. When DNS queries were made for *.autonlab.org > hosts, the internal DNS servers dished up the internal 192.168.x.x. address > of the server in question. These addresses do not match the "real" or > external public IP addresses that Internet DNS servers return for these > hosts. In effect, DNS duality. > > > To make matters more convoluted, On VPN connected workstations, depending on > how the domain servers and search paths were set, some machines were getting > the internal addresses when they queried for *.autonlab.org hosts, and some > were getting the public 128.2 addresses. Traffic going to the 192 addresses > are transferred over the VPN, incurring overhead for the encryption of the > VPN connection, and adding load to the primary firewall machine in our > server rack. > > > Depending on whether a machine was using internal DNS or external DNS also > affected which hostnames would even resolve. Also, requests coming from > inside (192) vs. outside IP addresses are treated differently by the > firewall, webservers, etc, sometimes leading one group of users to not be > able to access a resource when they should. > > > To solve this (and other problems related to DNS duality of the autonlab.org > DNS), I am introducing the int.autonlab.org domain. When machines are > supposed to access internal IP's by default, they will be configured with > the int.autonlab.org search suffix. 
The search suffix is what allows you to > say "ssh lop1" instead of "ssh lop1.auton. etc.) > > Fixing the DNS duality is one step in fixing "once and for all" some nagging > issues that has caused lots of wasted sysadmin time over the years. > > > Your comments appreciated, > > > Mike > From mjbaysek at cs.cmu.edu Wed Dec 23 15:31:49 2009 From: mjbaysek at cs.cmu.edu (Michael J. Baysek) Date: Wed, 23 Dec 2009 15:31:49 -0500 Subject: [auton-users] NOTICE: Lab Downtime Monday Dec 28. Message-ID: <4B327E35.908@cs.cmu.edu> Hi Lab. Please read this email in its entirety. If you have any questions or concerns, you need to let me know as soon as possible. All Auton Lab systems will be offline on Monday Dec 28 beginning at 8:00 AM. This includes web servers, file servers, TCube Web Interface instances, CVS, Subversion, VPN, and Linux desktops, and pretty much everything else I haven't mentioned. I expect that by 5:00 PM, most services will again be available. When the system comes back up, we will have a new fileserver. The mount points for home and data directories will be changing. BigPapa will be going away. If you currently access "/mnt/BigPapa" from a Linux box named gs*.sp.cs.cmu.edu, you will need to email me and request that the new mount points be set up correctly on your machine. If you are using a machine in the .auton.cs.cmu.edu or .autonlab.org domains, I will make all necessary changes for you. Additionally, the internal IP addressing scheme will be changing. For years, we have used 192.168.1.x as our internal IP range. This has limited our ability to work offsite over VPN in a number of ways since most home routers use this address range. This longstanding issue will also be resolved during the downtime. OUTAGE: Monday December 28 WHEN: 8:00 AM to 5:00 PM (for critical services) Mike P.S. If you need to keep working during the outage, I recommend copying whatever you are working on to a local disk and working from that copy. 
Keep in mind, you won't be able to log into any BigPapa-connected desktops at all, so don't copy it there! From mjbaysek at cs.cmu.edu Wed Dec 23 16:22:42 2009 From: mjbaysek at cs.cmu.edu (Michael J. Baysek) Date: Wed, 23 Dec 2009 16:22:42 -0500 Subject: [auton-users] NOTICE: Lab Downtime Monday Dec 28. In-Reply-To: References: <4B327E35.908@cs.cmu.edu> Message-ID: <4B328A22.7070608@cs.cmu.edu> Daniel, (and CC to list in case others had same questions) 1) Neill* machine availability Yes. All servers including neill* machines will be down during this time. The very core of the entire infrastructure is being refactored. Neill* machines do rely on some Auton infrastructure, such as routing and networking, so all machines are affected. 2) BigPapa going away The intention is to get completely away from using the /mnt/BigPapa mount point at all. I anticipated that scripts would be referencing it directly. I considered the use of symlinks, but that will only serve to foster its survival, so at this point I am against it. Users could fix the scripts themselves, but instead, I plan to create a script to crawl the file space for scripts referencing BigPapa and fix them to point to the new locations. Nobody should have to do anything once this script is run unless they have BigPapa hardcoded into C code. As for the mount point change, all Auton resources will be mounted from /auton from now on. Documentation is upcoming on this change, and will be released before the outage begins. Specific to you and your students, I wasn't planning on making any changes to the neill-fs mountpoints at this time. Mike Daniel B. Neill wrote: > Hi Mike, could you please clarify the following: > > 1) Will the neill1-neill4 servers be going down as well? > > 2) After the downtime, how can we access data that was previously > available in the /mnt/BigPapa directory? (Can we keep a symbolic link > around so that we don't have to change all our scripts?) > > Thanks! 
> Daniel > > > On Wed, 23 Dec 2009, Michael J. Baysek wrote: > >> Hi Lab. >> >> >> Please read this email in its entirety. If you have any questions or >> concerns, you need to let me know as soon as possible. >> >> >> All Auton Lab systems will be offline on Monday Dec 28 beginning at >> 8:00 AM. This includes web servers, file servers, TCube Web Interface >> instances, CVS, Subversion, VPN, and Linux desktops, and pretty much >> everything else I haven't mentioned. I expect that by 5:00 PM, most >> services will again be available. >> When the system comes back up, we will have a new fileserver. The >> mount points for home and data directories will be changing. BigPapa >> will be going away. If you currently access "/mnt/BigPapa" from a >> Linux box named gs*.sp.cs.cmu.edu, you will need to email me and >> request the new mount points be setup correctly on your machine. If >> you are using a machine in the .auton.cs.cmu.edu or .autonlab.org >> domains, I will make all necessary changes for you. >> >> >> Additionally, the internal IP addressing scheme will be changing. >> For years, we have used 192.168.1.x as our internal IP range. This >> has limited our ability to work offsite over VPN in a number of ways >> since most home routers use this address range. This longstanding >> issue will also be resolved during the downtime. >> >> >> OUTAGE: Monday December 28 >> WHEN: 8:00 AM to 5:00 PM (for critical services) >> >> >> Mike >> >> >> P.S. I recommend if you need to keep working during the outage, to >> copy whatever you are working on locally and work on it that way. >> Keep in mind, you won't be able to log into any BigPapa connected >> desktops at all, so don't copy it there! >> >> >> >> >> >> >> >> > From mjbaysek at cs.cmu.edu Wed Dec 23 16:31:17 2009 From: mjbaysek at cs.cmu.edu (Michael J. Baysek) Date: Wed, 23 Dec 2009 16:31:17 -0500 Subject: [auton-users] NOTICE: Lab Downtime Monday Dec 28. 
In-Reply-To: <4B328A22.7070608@cs.cmu.edu> References: <4B327E35.908@cs.cmu.edu> <4B328A22.7070608@cs.cmu.edu> Message-ID: <4B328C25.9010301@cs.cmu.edu> Another question I am bound to get is: What about CVS? Isn't that on BigPapa? Yes. CVSROOTs will need to be changed. We've already done this before when we moved CVS off of AFS back in 2005 or 2006. In this case, the moving of CVS is going to serve two purposes. We are moving toward making sure that all CVS access is performed over an SSH tunnel. This is going to allow us to do smart things with CVS that we couldn't do before... That is, if we don't move to a better RCS like Subversion. There is a script which changes the CVSROOT of a CVS sandbox. I will run this script on all network home directories. Any h/ directories on laptops or hard drives elsewhere will need to have the script run manually. Documentation on this issue is forthcoming. Many issues of this type are anticipated. Most of them are already planned for. The idea is to make this as smooth a transition as possible. If anyone has any other questions, please do not be afraid to ask! Mike Michael J. Baysek wrote: > Daniel, (and CC to list in case others had same questions) > > > 1) Neill* machine availability > > Yes. All servers including neill* machines will be down during this > time. The very core of the entire infrastructure is being > refactored. Neill* machines do rely on some Auton infrastructure, > such as routing and networking so all machines are affected. > > > 2) BigPapa going away > > The intention is to get completely away from using /mnt/BigPapa mount > point at all. > > I anticipated that scripts would be referencing it directly. I > considered the use of symlinks, but that will only serve to foster its > survival, so at this point I am against it. 
> > Users could fix the scripts themselves, but instead, I plan to create > a script to crawl the file space for scripts referencing BigPapa and > fix them to point to the new locations. Nobody should have to do > anything once this script is run unless they have BigPapa hardcoded > into C code. > > > As far as the mount point change, all Auton resources will be mounted > from /auton from now on. Documentation is upcoming on this change, > and will be released before the outage begins. > > Specific to you and your students, I wasn't planning on making any > changes to the neill-fs mountpoints at this time. > > > Mike > > > > > Daniel B. Neill wrote: >> Hi Mike, could you please clarify the following: >> >> 1) Will the neill1-neill4 servers be going down as well? >> >> 2) After the downtime, how can we access data that was previously >> available in the /mnt/BigPapa directory? (Can we keep a symbolic >> link around so that we don't have to change all our scripts?) >> >> Thanks! >> Daniel >> >> >> On Wed, 23 Dec 2009, Michael J. Baysek wrote: >> >>> Hi Lab. >>> >>> >>> Please read this email in its entirety. If you have any questions >>> or concerns, you need to let me know as soon as possible. >>> >>> >>> All Auton Lab systems will be offline on Monday Dec 28 beginning at >>> 8:00 AM. This includes web servers, file servers, TCube Web >>> Interface instances, CVS, Subversion, VPN, and Linux desktops, and >>> pretty much everything else I haven't mentioned. I expect that by >>> 5:00 PM, most services will again be available. >>> When the system comes back up, we will have a new fileserver. The >>> mount points for home and data directories will be changing. >>> BigPapa will be going away. If you currently access "/mnt/BigPapa" >>> from a Linux box named gs*.sp.cs.cmu.edu, you will need to email me >>> and request the new mount points be setup correctly on your >>> machine. 
If you are using a machine in the .auton.cs.cmu.edu or >>> .autonlab.org domains, I will make all necessary changes for you. >>> >>> >>> Additionally, the internal IP addressing scheme will be changing. >>> For years, we have used 192.168.1.x as our internal IP range. This >>> has limited our ability to work offsite over VPN in a number of ways >>> since most home routers use this address range. This longstanding >>> issue will also be resolved during the downtime. >>> >>> >>> OUTAGE: Monday December 28 >>> WHEN: 8:00 AM to 5:00 PM (for critical services) >>> >>> >>> Mike >>> >>> >>> P.S. I recommend if you need to keep working during the outage, to >>> copy whatever you are working on locally and work on it that way. >>> Keep in mind, you won't be able to log into any BigPapa connected >>> desktops at all, so don't copy it there! >>> >>> >>> >>> >>> >>> >>> >>> >> > From mjbaysek at cs.cmu.edu Wed Dec 23 16:39:26 2009 From: mjbaysek at cs.cmu.edu (Michael J. Baysek) Date: Wed, 23 Dec 2009 16:39:26 -0500 Subject: [auton-users] Log out of workstations before leaving for break Message-ID: <4B328E0E.1010002@cs.cmu.edu> Hi Lab. This message is mainly for the full-time staff and faculty at the lab who have Linux desktops. *Be sure to log out of the desktop session on your computer before leaving for break.* If you don't log out, you may lose data, as your home directory will be moved to another server. Even your desktop settings could become corrupted. If you already left and want to kill your processes and login session "somewhat nicely" you can do so by logging into your box over ssh and running the script at /mnt/BigPapa/home/public/system-upgrade-2009/kill_allmyprocs.sh . You could still lose unsaved data in your applications, but it's better than me pulling the file system out from under your running processes and desktop environments. This script can also be run on the lops to kill any VNC desktop environments or background jobs. 
Be aware, it will run the kill command on each job it lists. You should end the processes another way if possible before running this script. The kill_allmyprocs.sh script never kills jobs for other users or system jobs (unless you run it as root), so don't worry about that. From mjbaysek at cs.cmu.edu Sun Dec 27 18:05:11 2009 From: mjbaysek at cs.cmu.edu (Michael Baysek) Date: Sun, 27 Dec 2009 18:05:11 -0500 Subject: [auton-users] Systems Down Tomorrow Message-ID: <4B37E827.7000909@cs.cmu.edu> Hi Lab, This is a reminder that beginning at 8:00 AM tomorrow (Monday), all Auton Lab systems including all VPN-connected Linux desktops will be inaccessible for the majority of the day. This time is being used to make the move to the new file server, and perform various other maintenance tasks on the infrastructure. If you need to work during the downtime, please copy your work to your local disk on your laptop, etc. tonight. I will not be responding to any tech requests while I am working on the system. Best, Mike From mjbaysek at cs.cmu.edu Mon Dec 28 19:09:01 2009 From: mjbaysek at cs.cmu.edu (Michael Baysek) Date: Mon, 28 Dec 2009 19:09:01 -0500 Subject: [auton-users] System Progress: Not ready just yet Message-ID: <4B39489D.307@cs.cmu.edu> Hi Lab, As it turns out, a second day of work is required to get the system back to usable. Know that even though you can physically log in, the work is not finished, so don't run any jobs unless you want to risk them being killed. Also, desktop machines are still not available. For now, you may login to LOP1 (but please, only LOP1) to browse around. You will notice that all things Auton now live in /auton. Also, you will notice that all internal IP addresses have changed from 192.168.1.x to 192.168.6.x. In the coming days I will grep all scripts and CVS for references to BigPapa and change them all to the new location. 
When the dust settles on the update, you shouldn't have to change any of your scripts or C code (except for running a cvs update). More info coming on this as it happens. Mike From mjbaysek at cs.cmu.edu Mon Dec 28 19:17:40 2009 From: mjbaysek at cs.cmu.edu (Michael Baysek) Date: Mon, 28 Dec 2009 19:17:40 -0500 Subject: [auton-users] System Progress: Not ready just yet Message-ID: <4B394AA4.6060509@cs.cmu.edu> Hi Lab, As it turns out, a second day of work is required to get the system back to usable. Know that even though you can physically log in, the work is not finished, so don't run any jobs unless you want to risk them being killed. Also, desktop machines are still not available. For now, you may login to LOP1 (but please, only LOP1) to browse around. You will notice that all things Auton now live in /auton. Also, you will notice that all internal IP addresses have changed from 192.168.1.x to 192.168.6.x. In the coming days I will grep all scripts and CVS for references to BigPapa and change them all to the new location. When the dust settles on the update, you shouldn't have to change any of your scripts or C code (except for running a cvs update). More info coming on this as it happens. Mike (you may get two copies of this mail) From mjbaysek at cs.cmu.edu Mon Dec 28 19:28:46 2009 From: mjbaysek at cs.cmu.edu (Michael Baysek) Date: Mon, 28 Dec 2009 19:28:46 -0500 Subject: [auton-users] System Progress: Not ready just yet In-Reply-To: <4B394AA4.6060509@cs.cmu.edu> References: <4B394AA4.6060509@cs.cmu.edu> Message-ID: <4B394D3E.8070307@cs.cmu.edu> Many services like CVS are not working yet. I will work on things and provide more detailed status tomorrow, as at the moment I haven't enumerated the things which might not be working. Mike On 12/28/09 7:17 PM, Michael Baysek wrote: > Hi Lab, > > > As it turns out, a second day of work is required to get the system > back to usable. 
Know that even though you can physically log in, the > work is not finished, so don't run any jobs unless you want to risk > them being killed. Also, desktop machines are still not available. > > > For now, you may login to LOP1 (but please, only LOP1) to browse > around. You will notice that all things Auton now live in /auton. > Also, you will notice that all internal IP addresses have changed from > 192.168.1.x to 192.168.6.x. > > > In the coming days I will grep all scripts and CVS for references to > BigPapa and change them all to the new location. When the dust > settles on the update, you shouldn't have to change any of your > scripts or C code (except for running a cvs update). More info coming > on this as it happens. > > > Mike > > (you may get two copies of this mail) > From mjbaysek at cs.cmu.edu Tue Dec 29 17:43:47 2009 From: mjbaysek at cs.cmu.edu (Michael J. Baysek) Date: Tue, 29 Dec 2009 17:43:47 -0500 Subject: [auton-users] System Update Message-ID: <4B3A8623.40501@cs.cmu.edu> Hi Lab, The majority of the work is now complete, and the system is safe to use again. Everything Auton now lives in /auton. BigPapa is also gone, and as such, can no longer cause your fingers to stretch for that nemesis shift key every time you type it. Your home directory is now /auton/home/yourhome. The data directory is in /auton/data, etc. You will notice that all top-level /auton/* directories are symlinks into the /auton/volumes directory. You should never use any directory in /auton/volumes/* directly, as this is where the backend disks are located. The idea behind this is that we can move, say, 'home' or 'data' to a new set of disks without changing its /auton/home or /auton/data path. You'll notice that the df command no longer accurately reports the free space. It should not be a concern, but if you like, you may check the disk space by viewing the /auton/diskspace.txt file. 
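The symlink indirection described above can be sketched in a few lines of Python. This is only an illustration under assumed names: the throwaway directory tree, the disk_a and disk_b volume names, and the retarget_symlink helper are all hypothetical and are not the lab's actual tooling. It assumes a POSIX system, and shows why a public path such as /auton/data can stay fixed while the backing volume underneath it changes.

```python
import os
import tempfile


def retarget_symlink(link_path, new_target):
    """Point link_path at new_target, replacing any existing link in one rename."""
    tmp_link = link_path + ".tmp"
    os.symlink(new_target, tmp_link)
    # rename() acts on the link itself, so the old link is swapped out
    # without ever touching the directory it pointed to.
    os.replace(tmp_link, link_path)


def demo():
    """Build a throwaway /auton-like tree and move 'data' between two volumes."""
    root = tempfile.mkdtemp()
    # Hypothetical backend locations standing in for /auton/volumes/*.
    vol_a = os.path.join(root, "volumes", "disk_a", "data")
    vol_b = os.path.join(root, "volumes", "disk_b", "data")
    os.makedirs(vol_a)
    os.makedirs(vol_b)

    public = os.path.join(root, "data")  # the only path users should reference

    retarget_symlink(public, vol_a)
    before = os.path.realpath(public)

    retarget_symlink(public, vol_b)  # "move" data to a new set of disks
    after = os.path.realpath(public)
    return before, after
```

Calling demo() returns the resolved location before and after the switch; the public path is the same string in both cases, which is why scripts should reference the top-level name rather than anything under volumes.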
Also, anything put on the /auton/data or /auton/archive directory will be stored with realtime compression. This will allow us to store approximately twice as much data on the same disks. Use /auton/archive to move old or static files out of the way. This will make life easier for the backup system. The location of CVS has also changed to lop1.autonlab.org:/auton/repos/CVS . I have already updated all of your shell environments and "~/h/" sandboxes to use the new CVS location, but if there are sandboxes in other places, let me know so I can make a list of them and take care of those too. If you are using CVS from a laptop, please set your environment as follows:
export CVSROOT=lop1.autonlab.org:/auton/repos/CVS
export CVS_RSH=ssh
And replace all ~/h/*/CVS/Root files with one containing "lop1.autonlab.org:/auton/repos/CVS". There are a few things like WebCVS and Windows file sharing which I still need to take care of, but 95% of systems are go. There will be documentation updates on the Intranet in the coming days and weeks which will explain how to get the most out of the new setup. Please email me with any questions you have in the meantime. Happy New Year! Mike From mjbaysek at cs.cmu.edu Tue Dec 29 17:59:06 2009 From: mjbaysek at cs.cmu.edu (Michael J. Baysek) Date: Tue, 29 Dec 2009 17:59:06 -0500 Subject: [auton-users] Desktops Message-ID: <4B3A89BA.70307@cs.cmu.edu> Hi again Lab, In order to get desktops up and running, I must gain physical access to them. Since Sumitra is not around over the break, this means that I will need to visit some workstations on Monday the 4th, after break. I was able to get to the desktops in the following rooms in NSH: 3117, 3119, 3123. If your machine is not in one of these rooms, then I will visit it on Monday the 4th. Please send me a request if your machine is not in the NSH 3100 corridor, and suggest a time on or after the 4th you'd like me to stop by. Mike
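For the CVS/Root replacement described in the System Update message above, a minimal sketch in Python might look like the following. It is not the lab's own CVSROOT script, which is not shown in this archive; the function name rewrite_cvs_roots is hypothetical, and the only value taken from the message is the new root string. As an assumption beyond the "~/h/*/CVS/Root" pattern given above, it walks the whole tree, because a CVS sandbox normally keeps a CVS/Root file in every checked-out subdirectory.

```python
import os

# New repository location, as given in the System Update message.
NEW_ROOT = "lop1.autonlab.org:/auton/repos/CVS"


def rewrite_cvs_roots(sandbox_top, new_root=NEW_ROOT):
    """Overwrite every CVS/Root file under sandbox_top.

    Returns the number of files whose contents were actually changed.
    """
    changed = 0
    for dirpath, _dirnames, filenames in os.walk(sandbox_top):
        # Only touch files named Root that sit directly inside a CVS directory.
        if os.path.basename(dirpath) != "CVS" or "Root" not in filenames:
            continue
        root_file = os.path.join(dirpath, "Root")
        with open(root_file) as fh:
            current = fh.read().strip()
        if current != new_root:
            with open(root_file, "w") as fh:
                fh.write(new_root + "\n")
            changed += 1
    return changed
```

On a laptop this could be run as rewrite_cvs_roots(os.path.expanduser("~/h")). Running it a second time should report zero changes, which makes it safe to repeat if you are unsure whether a sandbox was already converted.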