GPU memory Usage
Sarveshwaran Jayaraman
sarveshj at andrew.cmu.edu
Wed Apr 29 12:33:25 EDT 2020
Hi All,
I was trying to run my experiments on the gpu14 node and came across this situation. On GPUs 0 and 1 (highlighted in green and blue, respectively), a user has not released the GPU memory after their experiment.
A possible cause is that the user did not shut down the Jupyter notebook after use. Closing the notebook tab is not enough, because the kernel keeps running and holds on to its GPU memory. As a result, 2 of the 4 GPUs on that node are unavailable.
If that is the case, please remember to free GPU memory after use so that other users can run their jobs. One simple workaround I found is to convert your notebooks to Python scripts and run them with the nohup command, so the memory is released as soon as the script exits.
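A minimal sketch of that workaround, assuming a notebook with a Python kernel and no IPython magics (cells using magics may need editing after conversion). The function name run_notebook_detached, the notebook name experiment.ipynb, and the PYTHON override variable are hypothetical and only serve the illustration.

```shell
# Sketch: convert a notebook to a script and run it detached from the terminal.
run_notebook_detached() {
    local notebook="$1"
    local script="${notebook%.ipynb}.py"   # nbconvert writes .py for Python kernels
    local log="${notebook%.ipynb}.log"

    # Export the notebook's code cells to a plain script.
    jupyter nbconvert --to script "$notebook" || return 1

    # nohup keeps the job running after logout. Once the script exits,
    # its process ends and the GPU memory it held is freed.
    # PYTHON is a hypothetical override for the interpreter to use.
    nohup "${PYTHON:-python}" "$script" > "$log" 2>&1 &
    echo "started PID $! (log: $log)"
}

# Usage (hypothetical notebook name):
#   run_notebook_detached experiment.ipynb
```

The printed PID can be matched against the process table in nvidia-smi to confirm the job has finished and no longer holds memory.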
Thanks for your understanding!
(base) sarveshj@gpu14$ nvidia-smi -l 3
Wed Apr 29 11:18:10 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:18:00.0 Off |                  N/A |
| 32%   48C    P2    59W / 250W |  10980MiB / 11019MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208...  Off  | 00000000:3B:00.0 Off |                  N/A |
| 41%   66C    P2    98W / 250W |  10984MiB / 11019MiB |     18%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce RTX 208...  Off  | 00000000:86:00.0 Off |                  N/A |
| 33%   54C    P2    67W / 250W |   1935MiB / 11019MiB |      9%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce RTX 208...  Off  | 00000000:AF:00.0 Off |                  N/A |
| 32%   43C    P2    62W / 250W |   8849MiB / 11019MiB |      4%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0    203849      C   python3                                     1677MiB |
|    1    203849      C   python3                                      155MiB |
|    1    236031      C   /home/scratch/sarveshj/mini/bin/python3    10817MiB |
|    2    203849      C   python3                                      155MiB |
|    2    232877      C   python                                      1613MiB |
|    2    236031      C   /home/scratch/sarveshj/mini/bin/python3      155MiB |
|    3    147113      C   python                                      1613MiB |
|    3    203849      C   python3                                      155MiB |
+-----------------------------------------------------------------------------+
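The process table lists PIDs but not their owners. As a rough sketch, the PIDs can be mapped to user names with ps. The helper name gpu_process_owners is hypothetical. It expects "pid, memory" lines on standard input, which is the CSV shape produced by nvidia-smi's --query-compute-apps query shown in the usage comment.

```shell
# Sketch: print PID, GPU memory (MiB), and owning user for each input line.
# Input lines look like "203849, 1677".
gpu_process_owners() {
    while IFS=', ' read -r pid mem; do
        # Look up the owner; the process may already have exited.
        owner=$(ps -o user= -p "$pid" 2>/dev/null)
        [ -n "$owner" ] || owner="unknown"
        printf '%s\t%s MiB\t%s\n' "$pid" "$mem" "$owner"
    done
}

# Usage on a GPU node:
#   nvidia-smi --query-compute-apps=pid,used_memory \
#              --format=csv,noheader,nounits | gpu_process_owners
```

A process that holds several GiB while its GPU shows 0% utilization, as on GPU 0 above, is a likely candidate for an idle notebook kernel, though this alone does not prove the job is finished.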
Sarvesh Jayaraman<https://www.linkedin.com/in/sarveshjayaraman/>
Sr. Research Analyst, Auton Lab
Carnegie Mellon University
Mob: +1-240-893-4287