TF jobs hang with GPU at P8 perf level
Predrag Punosevac
predragp at andrew.cmu.edu
Thu Oct 19 09:39:55 EDT 2017
Rui Peng <pengrui at cmu.edu> wrote:
> Hi Predrag,
>
> All my processes (on gpu3, gpu4) had been running fine until a few hours ago,
> when all my newly started TensorFlow jobs stopped making progress
> (they appear to hang) after merely allocating their memory chunks. I see 0% GPU
> utilization but full memory occupancy (the memory usage is intended).
> Have you had any experience with cases like the above?
Nope. I am CC-ing users at autonlab.org to see if anybody else has.
Predrag
>
> Best,
> Rui
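
For anyone trying to confirm the same symptom (memory fully allocated, 0% utilization, GPU sitting in the idle P8 state), here is a minimal Python sketch that reads those values from nvidia-smi. The function names and the 90% memory threshold are illustrative assumptions, not an official diagnostic. The query fields (index, pstate, utilization.gpu, memory.used, memory.total) are standard nvidia-smi --query-gpu fields. Keep in mind that TensorFlow reserves most of the GPU memory at startup by default, so full memory by itself does not show that a job is computing.

```python
import subprocess

# Fields requested from nvidia-smi; all are standard --query-gpu fields.
QUERY_FIELDS = "index,pstate,utilization.gpu,memory.used,memory.total"


def parse_gpu_rows(csv_text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into dicts.

    Assumes every field is numeric or a P-state string; GPUs that report
    "[N/A]" for a field would need extra handling.
    """
    rows = []
    for line in csv_text.strip().splitlines():
        index, pstate, util, used, total = [f.strip() for f in line.split(",")]
        rows.append({
            "index": int(index),
            "pstate": pstate,
            "util_pct": int(util),
            "mem_used_mib": int(used),
            "mem_total_mib": int(total),
        })
    return rows


def looks_stalled(gpu, mem_fraction=0.9):
    """Heuristic (an assumption): memory mostly full but zero utilization."""
    return (gpu["util_pct"] == 0
            and gpu["mem_used_mib"] >= mem_fraction * gpu["mem_total_mib"])


def query_gpus():
    """Run nvidia-smi on the local machine and return the parsed rows."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=" + QUERY_FIELDS,
         "--format=csv,noheader,nounits"],
        text=True,
    )
    return parse_gpu_rows(out)
```

On a GPU node, something like `[g for g in query_gpus() if looks_stalled(g)]` would list the suspect devices. A single sample can be misleading, so it is worth polling a few times before concluding that a job is hung.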