<div dir="ltr">It's possible that you followed some instructions I sent a while ago and are using your own version of cuDNN, which would explain the 4007-vs-5103 version mismatch in the error. Try "echo $LD_LIBRARY_PATH" and make sure it only has things in /usr/local and /usr/lib64 (nothing in your own directories), and make sure that your Python code doesn't change it.<div><br></div><div>The Anaconda Python distribution now distributes cuDNN and tensorflow-gpu, so you could also install that in your scratch dir to have your own install. But they only have TensorFlow 1.0 and higher, so your old code would require some changes (the system install on gpu1 is 0.10, and there were breaking changes in both 1.0 and 1.1).</div></div><br><div class="gmail_quote"><div dir="ltr">On Fri, May 12, 2017 at 4:55 PM Dougal Sutherland <<a href="mailto:dougal@gmail.com">dougal@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">It works for me too, not in IPython. Try this:<div><br></div><div>CUDA_VISIBLE_DEVICES=5 python -c 'import tensorflow as tf; tf.InteractiveSession()'</div></div><br><div class="gmail_quote"><div dir="ltr">On Fri, May 12, 2017 at 4:55 PM Kirthevasan Kandasamy <<a href="mailto:kandasamy@cmu.edu" target="_blank">kandasamy@cmu.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">No, I don't use IPython.</div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, May 12, 2017 at 11:22 AM, <span dir="ltr"><<a href="mailto:chiragn@andrew.cmu.edu" target="_blank">chiragn@andrew.cmu.edu</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Have you tried running it from within an IPython notebook as an interactive<br>
session?<br>
<br>
I am doing that right now and it works.<br>
<br>
Chirag<br>
<div class="m_-4840874716259804677m_424370253774490668HOEnZb"><div class="m_-4840874716259804677m_424370253774490668h5"><br>
<br>
> Kirthevasan Kandasamy <<a href="mailto:kandasamy@cmu.edu" target="_blank">kandasamy@cmu.edu</a>> wrote:<br>
><br>
>> Hi Predrag,<br>
>><br>
>> I am re-running a tensorflow project on GPU1 - I haven't touched it in<br>
>> 4/5<br>
>> months, and the last time I ran it it worked fine, but when I try now I<br>
>> seem to be getting the following error.<br>
>><br>
><br>
> This is the first time I have heard about it. I was under the impression that GPU<br>
> nodes were usable. I am redirecting your e-mail to <a href="mailto:users@autonlab.org" target="_blank">users@autonlab.org</a><br>
> in the hope that somebody who uses TensorFlow on a regular basis<br>
> can be of more help.<br>
><br>
> Predrag<br>
><br>
><br>
><br>
><br>
>> Can you please tell me what the issue might be or direct me to someone<br>
>> who<br>
>> might know?<br>
>><br>
>> This is for the NIPS deadline, so I would appreciate a quick response.<br>
>><br>
>> thanks,<br>
<br>
>> Samy<br>
>><br>
>><br>
>> I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0<br>
>> with<br>
>> properties:<br>
>> name: Tesla K80<br>
>> major: 3 minor: 7 memoryClockRate (GHz) 0.8235<br>
>> pciBusID 0000:05:00.0<br>
>> Total memory: 11.17GiB<br>
>> Free memory: 11.11GiB<br>
>> I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0<br>
>> I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y<br>
>> I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating<br>
>> TensorFlow<br>
>> device (/gpu:0) -> (device: 0, name: Tesla K80, pci bus id:<br>
>> 0000:05:00.0)<br>
>> E tensorflow/stream_executor/cuda/cuda_dnn.cc:347] Loaded runtime CuDNN<br>
>> library: 4007 (compatibility version 4000) but source was compiled with<br>
>> 5103 (compatibility version 5100). If using a binary install, upgrade<br>
>> your<br>
>> CuDNN library to match. If building from sources, make sure the library<br>
>> loaded at runtime matches a compatible version specified during compile<br>
>> configuration.<br>
>> F tensorflow/core/kernels/conv_ops.cc:457] Check failed:<br>
>> stream->parent()->GetConvolveAlgorithms(&algorithms)<br>
>> run_resnet.sh: line 49: 22665 Aborted (core dumped)<br>
>> CUDA_VISIBLE_DEVICES=$GPU python ../resnettf/resnet_main.py --data_dir<br>
>> $DATA_DIR --max_batch_iters $NUM_ITERS --report_results_every<br>
>> $REPORT_RESULTS_EVERY --log_root $LOG_ROOT --dataset $DATASET --num_gpus<br>
>> 1<br>
>> --save_model_dir $SAVE_MODEL_DIR --save_model_every $SAVE_MODEL_EVERY<br>
>> --skip_add_method $SKIP_ADD_METHOD --architecture $ARCHITECTURE<br>
>> --skip_size<br>
>> $SKIP_SIZE<br>
><br>
<br>
<br>
</div></div></blockquote></div><br></div>
</blockquote></div></blockquote></div>
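<div dir="ltr">For reference, the LD_LIBRARY_PATH check suggested at the top of this thread could be scripted roughly as below. This is only a sketch: the function name <code>unexpected_library_dirs</code> and the constant <code>ALLOWED_PREFIXES</code> are made up for illustration, and the allowed prefixes are simply the two directories named in the reply, so adjust them for your own machine.</div>

```python
import os

# Assumption: these are the only locations that should appear, taken from
# the advice in this thread (/usr/local and /usr/lib64).
ALLOWED_PREFIXES = ("/usr/local", "/usr/lib64")


def unexpected_library_dirs(ld_library_path, allowed=ALLOWED_PREFIXES):
    """Return the LD_LIBRARY_PATH entries that fall outside the allowed prefixes."""
    unexpected = []
    for entry in ld_library_path.split(":"):
        if not entry:
            # Skip empty entries produced by leading, trailing, or doubled colons.
            continue
        norm = os.path.normpath(entry)
        # Match the prefix itself or anything below it, but not a sibling
        # directory such as /usr/localx.
        if not any(norm == p or norm.startswith(p + "/") for p in allowed):
            unexpected.append(entry)
    return unexpected


if __name__ == "__main__":
    for path in unexpected_library_dirs(os.environ.get("LD_LIBRARY_PATH", "")):
        print("unexpected entry:", path)
```

<div dir="ltr">Running it in the same shell (and with the same environment) as the failing job should list any personal directories, such as an old cuDNN install, that the loader would search. It does not detect changes your Python code makes to the variable later on.</div>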