[auton-users] LOP1: PLEASE DO NOT Run jobs there!

Michael Baysek mjbaysek at cs.cmu.edu
Sun Mar 7 06:49:46 EST 2010


Hi Lab,

Someone just came close to taking down LOP1 by overloading it.
I am taking this opportunity to stress the importance of keeping
LOP1 free of jobs.

LOP1's primary job is to be the access node for the rest of
the machines.  Here are some valid uses that fall under that
category:

1) SSHing to LOP1 in order to SSH to another compute node.

2) Using LOP1 to scp data into the /auton or /neill space.

3) Using it to tunnel CVS over SSH (most of you do this without
    even knowing it).

4) Running only very light-duty, low-memory scripts and
    short-running processes.

WHY?

1) Everyone accesses all computing resources through LOP1
    (or LOP2 as a backup, but mostly LOP1).

2) Overloading LOP1 disconnects everyone from the cluster!

3) Any processes or sessions, on any machine, that people forgot
    to run in screen will be lost when LOP1 drops their connection!

4) Many people tunnel X11 apps on other nodes over SSH through
    LOP1.  Overloading it will, without fail, cause those people's
    programs to die.

5) If you run big jobs on LOP1 and I see them, I kill them.
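
For anyone who wants a safety net, a job script can check where it is
running before it starts any heavy work.  Below is a minimal sketch in
Python.  It assumes the access nodes report short hostnames of "lop1"
and "lop2"; the ACCESS_NODES set and both function names are
illustrative and are not an existing lab tool.

```python
import socket

# Assumption: these short hostnames identify the access nodes.
ACCESS_NODES = {"lop1", "lop2"}


def is_access_node(hostname):
    """Return True if hostname (short or fully qualified) is an access node."""
    short = hostname.split(".")[0].lower()
    return short in ACCESS_NODES


def refuse_on_access_node(hostname=None):
    """Exit with a message if running on an access node; otherwise do nothing."""
    if hostname is None:
        # Ask the operating system which machine this script is running on.
        hostname = socket.gethostname()
    if is_access_node(hostname):
        raise SystemExit(
            f"{hostname} is an access node; SSH to a compute node and run the job there."
        )
```

Calling refuse_on_access_node() as the first line of a job script would
make the script stop immediately if it is launched on LOP1 or LOP2 by
mistake, and continue as normal on any other machine.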






