IMPORTANT [Re: Server load]

Artur Dubrawski awd at cs.cmu.edu
Thu Nov 14 12:23:59 EST 2019


Team,

I can't stress enough how important it is that we be good citizens in how we
utilize our shared computing resources.

The key realization is that their capacity is not infinite, and so we all
need to play along nicely.

For now, please follow the recommendations provided by Ben below.

In the meantime, Predrag is preparing formal guidelines, to be published
shortly.

Barnabas, Jeff and I will convene to consider implementing more systematic
measures of resource consumption control, if that turns out to be necessary
because the problems we have been facing over the past few days persist.

Cheers,
Artur


On Thu, Nov 14, 2019 at 11:36 AM Benedikt Boecking <boecking at andrew.cmu.edu>
wrote:

> Hi all,
>
> We are currently running into problems on several servers due to some
> users spawning too many threads. On several compute nodes we have several
> thousand threads running in parallel.
>
> To be a good lab member, please:
> 1. Use top/htop and nvidia-smi to monitor your resource usage and keep it
> reasonable (memory, cpu, and gpu usage).
>
> 2. Check that you aren’t using more threads than intended due to
> automatic multithreading. This happens, for example, with numpy and scipy
> linalg functions on the servers. You can control most of this behavior by
> setting the following variables to however many threads you want before
> running scripts or notebooks:
>
> export MKL_NUM_THREADS=1
> export NUMEXPR_NUM_THREADS=1
> export OMP_NUM_THREADS=1
>
> 3. If you have (interactive) sessions open that you don’t need anymore,
> please close them to free memory (in particular Matlab and Jupyter
> notebooks).
>
>
>
> Thanks,
> Ben
>
>
>
>
>
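
For notebooks and scripts where exporting the variables in the shell is
inconvenient, the same limits can be set from Python. The following is only a
minimal sketch, not an official lab tool: the helper names limit_threads and
count_my_threads are hypothetical, and the thread count assumes a Linux /proc
filesystem. It also assumes the numerical libraries read these variables when
they are first loaded, so the call should come before importing numpy or scipy.

```python
import os

# The three variables named in the message above.
THREAD_VARS = ("MKL_NUM_THREADS", "NUMEXPR_NUM_THREADS", "OMP_NUM_THREADS")


def limit_threads(n=1, env=os.environ):
    """Set each thread-count variable to n and return the values applied."""
    if n < 1:
        raise ValueError("thread count must be at least 1")
    applied = {}
    for name in THREAD_VARS:
        env[name] = str(n)
        applied[name] = env[name]
    return applied


def count_my_threads(proc_root="/proc"):
    """Sum the thread counts of processes owned by the current user.

    Linux only (assumption): returns None when proc_root does not exist.
    """
    if not os.path.isdir(proc_root):
        return None
    uid = os.getuid()
    total = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            if os.stat(os.path.join(proc_root, entry)).st_uid != uid:
                continue  # someone else's process
            with open(os.path.join(proc_root, entry, "status")) as fh:
                for line in fh:
                    if line.startswith("Threads:"):
                        total += int(line.split()[1])
                        break
        except OSError:
            continue  # process exited or is not readable
    return total


# Apply the limit before importing numpy or scipy.
limit_threads(1)
```

Calling count_my_threads() before and after starting a job gives a rough
check that the limit took effect; top/htop remain the simplest way to watch
usage interactively.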

