<div dir="ltr"><div class="gmail_msg">I was just able to build tensorflow 0.11.0rc0 on gpu3! I used the cuda 8.0 install, and it built fine. So additionally installing 7.5 was probably not necessary; in fact, cuda 7.5 doesn't know about the 6.1 compute architecture that the Titan Xs use, so Theano at least needs to be manually told to use an older architecture.</div><div class="gmail_msg"><br></div><div class="gmail_msg">A pip package is in ~dsutherl/tensorflow-0.11.0rc0-py2-none-any.whl. I think it should work fine with the cudnn in my scratch directory.</div><div class="gmail_msg"><br></div><div class="gmail_msg">You should probably install it to scratch, either running this first to put libraries in your scratch directory or using a virtualenv or something:</div><div class="gmail_msg">export PYTHONUSERBASE=/home/scratch/$USER/.local</div><div class="gmail_msg"><br></div><div class="gmail_msg">You'll need this to use the library and probably to install it:</div><div class="gmail_msg">export LD_LIBRARY_PATH=/home/scratch/dsutherl/cudnn-8.0-5.1/cuda/lib64:"$LD_LIBRARY_PATH"</div><div class="gmail_msg"><br></div><div class="gmail_msg">To install:</div><div class="gmail_msg">pip install --user ~dsutherl/tensorflow-0.11.0rc0-py2-none-any.whl</div><div class="gmail_msg">(remove --user if you're using a virtualenv)</div><div class="gmail_msg"><br></div><div class="gmail_msg">(A request: I'm submitting to ICLR in two weeks, and for some of the models I'm running gpu3's cards are 4x the speed of gpu1 or 2's. 
So please don't run a ton of stuff on gpu3 unless you're working on a deadline too.)</div><div class="gmail_msg"><br></div><div class="gmail_msg"><br></div><div class="gmail_msg"><br></div><div class="gmail_msg">Steps to install it, for the future:</div><div class="gmail_msg"><ul><li>Install bazel in your home directory:</li><ul><li><div class="gmail_msg">wget<span class="inbox-inbox-Apple-converted-space"> </span><a href="https://github.com/bazelbuild/bazel/releases/download/0.3.2/bazel-0.3.2-installer-linux-x86_64.sh">https://github.com/bazelbuild/bazel/releases/download/0.3.2/bazel-0.3.2-installer-linux-x86_64.sh</a><br></div></li><li><div class="gmail_msg">bash<span class="inbox-inbox-Apple-converted-space"> </span>bazel-0.3.2-installer-linux-x86_64.sh<span class="inbox-inbox-Apple-converted-space"> </span>--prefix=/home/scratch/$USER --base=/home/scratch/$USER/.bazel</div></li></ul><li>Configure bazel to build in scratch. There's probably a better way to do this, but this works:</li><ul><li>mkdir -p /home/scratch/$USER/.cache/bazel</li><li>ln -s /home/scratch/$USER/.cache/bazel ~/.cache/bazel</li></ul><li>Build tensorflow. Note that builds from git checkouts don't work, because they assume a newer version of git than is on gpu3:</li><ul><li>cd /home/scratch/$USER</li><li>wget</li><li>tar xf </li><li>cd tensorflow-0.11.0rc0</li><li>./configure</li><ul><li>This is an interactive script that doesn't seem to let you pass arguments or anything. It's obnoxious.</li><li>Use the default python</li><li>don't use cloud platform or hadoop file system</li><li>use the default site-packages path if it asks</li><li>build with GPU support</li><li>default gcc</li><li>default Cuda SDK version</li><li>specify /usr/local/cuda-8.0</li><li>default cudnn version</li><li>specify $CUDNN_DIR from use-cudnn.sh, e.g. 
/home/scratch/dsutherl/cudnn-8.0-5.1/cuda</li><li>Pascal Titan Xs have compute capability 6.1</li></ul><li>bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package</li><li>bazel-bin/tensorflow/tools/pip_package/build_pip_package ./</li><li>A .whl file, e.g. tensorflow-0.11.0rc0-py2-none-any.whl, is put in the directory you specified above.</li></ul></ul></div><div class="gmail_msg"><br></div><div class="gmail_msg">- Dougal</div><div class="gmail_msg"><br></div><br class="gmail_msg"><div class="gmail_quote gmail_msg"><div dir="ltr" class="gmail_msg">On Fri, Oct 21, 2016 at 6:14 PM Kirthevasan Kandasamy <<a href="mailto:kandasamy@cmu.edu" class="gmail_msg" target="_blank">kandasamy@cmu.edu</a>> wrote:<br class="gmail_msg"></div><blockquote class="gmail_quote gmail_msg" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class="gmail_msg">Predrag,<div class="gmail_msg"><br class="gmail_msg"></div><div class="gmail_msg">Any updates on gpu3?</div><div class="gmail_msg">I have tried both tensorflow and chainer and in both cases the problem seems to be with cuda</div></div><div class="gmail_extra gmail_msg"><br class="gmail_msg"><div class="gmail_quote gmail_msg">On Wed, Oct 19, 2016 at 4:10 PM, Predrag Punosevac <span dir="ltr" class="gmail_msg"><<a href="mailto:predragp@cs.cmu.edu" class="gmail_msg" target="_blank">predragp@cs.cmu.edu</a>></span> wrote:<br class="gmail_msg"><blockquote class="gmail_quote gmail_msg" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Dougal Sutherland <<a href="mailto:dougal@gmail.com" class="gmail_msg" target="_blank">dougal@gmail.com</a>> wrote:<br class="gmail_msg">
<br class="gmail_msg">
> I tried for a while. I failed.<br class="gmail_msg">
><br class="gmail_msg">
<br class="gmail_msg">
Damn this doesn't look good. I guess back to the drawing board. Thanks<br class="gmail_msg">
for the quick feedback.<br class="gmail_msg">
<br class="gmail_msg">
Predrag<br class="gmail_msg">
<br class="gmail_msg">
> Version 0.10.0 fails immediately on build: "The specified --crosstool_top<br class="gmail_msg">
> '@local_config_cuda//crosstool:crosstool' is not a valid cc_toolchain_suite<br class="gmail_msg">
> rule." Apparently this is because 0.10 required an older version of bazel (<br class="gmail_msg">
> <a href="https://github.com/tensorflow/tensorflow/issues/4368" rel="noreferrer" class="gmail_msg" target="_blank">https://github.com/tensorflow/tensorflow/issues/4368</a>), and I don't have the<br class="gmail_msg">
> energy to install an old version of bazel.<br class="gmail_msg">
><br class="gmail_msg">
> Version 0.11.0rc0 gets almost done and then complains about no such file or<br class="gmail_msg">
> directory for libcudart.so.7.5 (which is there, where I told tensorflow it<br class="gmail_msg">
> was...).<br class="gmail_msg">
><br class="gmail_msg">
> Non-release versions from git fail immediately because they call git -C to<br class="gmail_msg">
> get version info, which is only in git 1.9 (we have 1.8).<br class="gmail_msg">
><br class="gmail_msg">
><br class="gmail_msg">
> Some other notes:<br class="gmail_msg">
> - I made a symlink from ~/.cache/bazel to /home/scratch/$USER/.cache/bazel,<br class="gmail_msg">
> because bazel is the worst. (It complains about doing things on NFS, and<br class="gmail_msg">
> hung for me [clock-related?], and I can't find a global config file or<br class="gmail_msg">
> anything to change that in; it seems like there might be one, but their<br class="gmail_msg">
> documentation is terrible.)<br class="gmail_msg">
><br class="gmail_msg">
> - I wasn't able to use the actual Titan X compute capability of 6.1,<br class="gmail_msg">
> because that requires cuda 8; I used 5.2 instead. Probably not a huge deal,<br class="gmail_msg">
> but I don't know.<br class="gmail_msg">
><br class="gmail_msg">
> - I tried explicitly including /usr/local/cuda/lib64 in LD_LIBRARY_PATH and<br class="gmail_msg">
> set CUDA_HOME to /usr/local/cuda before building, hoping that would help<br class="gmail_msg">
> with the 0.11.0rc0 problem, but it didn't.<br class="gmail_msg">
</blockquote></div><br class="gmail_msg"></div>
</blockquote></div></div>