GPU Memory Usage

Sarveshwaran Jayaraman sarveshj at andrew.cmu.edu
Wed Apr 29 12:33:25 EDT 2020


Hi All,

I was trying to run my experiments on gpu14 and ran into the situation below. On GPUs 0 and 1 (the first two entries in the nvidia-smi output), a user has not released GPU memory after their experiment.
A likely scenario is that the user did not shut down their Jupyter notebook after use (simply closing it does not suffice). As a result, 2 of the 4 GPUs on that node are unavailable.
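
If one of the stale processes turns out to be your own, something like the following rough sketch (the PID is a placeholder; take the real ones from the nvidia-smi process table below) will release the memory:

  # list processes currently holding GPU memory
  nvidia-smi
  # check whether a given PID belongs to you before touching it
  ps -o user=,cmd= -p <PID>
  # if it is yours and no longer needed, terminate it to free the memory
  kill <PID>        # escalate to kill -9 <PID> only if it refuses to exit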


If that is the case, please be mindful and free GPU memory after use so it remains available to other users. One simple workaround I found is to convert notebooks to Python scripts and run them with the nohup command.
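
For example, something along these lines (just a sketch; train.ipynb, train.py, and train.log are placeholder names):

  # convert the notebook into a plain Python script
  jupyter nbconvert --to script train.ipynb
  # run it detached from the terminal; GPU memory is released when the script exits
  nohup python3 train.py > train.log 2>&1 &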

Thanks for your understanding!


(base) sarveshj@gpu14$ nvidia-smi -l 3
Wed Apr 29 11:18:10 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:18:00.0 Off |                  N/A |
| 32%   48C    P2    59W / 250W |  10980MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208...  Off  | 00000000:3B:00.0 Off |                  N/A |
| 41%   66C    P2    98W / 250W |  10984MiB / 11019MiB |     18%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce RTX 208...  Off  | 00000000:86:00.0 Off |                  N/A |
| 33%   54C    P2    67W / 250W |   1935MiB / 11019MiB |      9%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce RTX 208...  Off  | 00000000:AF:00.0 Off |                  N/A |
| 32%   43C    P2    62W / 250W |   8849MiB / 11019MiB |      4%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0    203849      C   python3                                     1677MiB |
|    1    203849      C   python3                                      155MiB |
|    1    236031      C   /home/scratch/sarveshj/mini/bin/python3    10817MiB |
|    2    203849      C   python3                                      155MiB |
|    2    232877      C   python                                      1613MiB |
|    2    236031      C   /home/scratch/sarveshj/mini/bin/python3      155MiB |
|    3    147113      C   python                                      1613MiB |
|    3    203849      C   python3                                      155MiB |
+-----------------------------------------------------------------------------+





Sarvesh Jayaraman <https://www.linkedin.com/in/sarveshjayaraman/>
Sr. Research Analyst, Auton Lab
Carnegie Mellon University
Mob: +1-240-893-4287



