a note about tensorflow and gpu memory usage

Dougal Sutherland dougal at gmail.com
Fri Oct 28 13:04:24 EDT 2016


Hi all,

Something that's not necessarily obvious to everyone about tensorflow: if
you just run something with tensorflow, it will by default allocate all of
the memory on *all* GPUs on the machine. It's pretty unlikely that whatever
model you're running is going to need all 48 GB in all 4 cards on gpu{2,3}.
:)

To stop this behavior, set the environment variable CUDA_VISIBLE_DEVICES so
that tensorflow only sees the relevant devices. For example, running
"CUDA_VISIBLE_DEVICES=0 python" will make that tensorflow session use only
gpu0. You can check which devices are free with nvidia-smi. Theano will
pick a single gpu to use by default; to choose a specific one, use
THEANO_FLAGS=device=gpu0.
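In case it helps, here's a minimal sketch of doing the same thing from
inside Python rather than on the shell command line; the key point is that
the variable has to be set before tensorflow (or anything else that
initializes CUDA) is imported:

```python
import os

# Restrict this process to gpu0. CUDA reads this variable when it
# initializes, so it must be set *before* importing tensorflow.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# import tensorflow as tf  # would now see only gpu0 (as "/gpu:0")
```

Note that the visible devices get renumbered, so the chosen card always
shows up to tensorflow as device 0.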

If you're running small models and want to fit more than one on a single
gpu, you can tell tensorflow not to grab all of a GPU's memory with the
methods discussed here <http://stackoverflow.com/q/34199233/344821>.
Setting per_process_gpu_memory_fraction makes it allocate only that
fraction of the GPU's memory; setting allow_growth=True makes it claim
memory only as it needs it.
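Concretely, that Stack Overflow thread amounts to passing a ConfigProto
when you create the session (the 0.3 fraction here is just an example
value):

```python
import tensorflow as tf

config = tf.ConfigProto()
# Option 1: claim only a fixed fraction (here 30%) of each visible GPU's memory.
config.gpu_options.per_process_gpu_memory_fraction = 0.3
# Option 2: start with a small allocation and grow it as the model needs it.
# config.gpu_options.allow_growth = True

sess = tf.Session(config=config)
```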

Theano's default behavior is similar to allow_growth=True; you can make it
preallocate memory (and often get substantial speedups) with
THEANO_FLAGS=device=gpu0,lib.cnmem=1. (lib.cnmem=.5 will allocate half the
GPU's memory; lib.cnmem=1024 will allocate 1GB.)
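As with CUDA_VISIBLE_DEVICES, you can also set the Theano flags from
Python as long as you do it before the import, since theano reads them at
import time; a sketch using the half-the-GPU setting from above:

```python
import os

# Pick gpu0 and have CNMeM preallocate 50% of its memory; this must
# happen before "import theano", which reads the flags on import.
os.environ["THEANO_FLAGS"] = "device=gpu0,lib.cnmem=.5"

# import theano  # would now preallocate half of gpu0's memory
```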

- Dougal