<p dir="ltr">They do work with 7.5 if you specify an older compute architecture; it's just that their actual compute capability of 6.1 isn't supported by cuda 7.5. Theano is thrown off by this, for example, but it can be fixed by telling it to pass compute capability 5.2 (for example) to nvcc. I don't think that this was my problem with building tensorflow on 7.5; I'm not sure what that was.</p>
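<p dir="ltr">For reference, a minimal sketch of one way to pass the older architecture to nvcc through Theano. It assumes Theano's old CUDA backend, where the <code>nvcc.flags</code> config option forwards extra arguments to nvcc; the variable name <code>FALLBACK_ARCH</code> is only illustrative, and sm_52 is simply the "5.2, for example" mentioned above.</p>

```shell
# Illustrative variable (not from the thread): the architecture to fall back to.
# CUDA 7.5's nvcc does not know sm_61, so target compute capability 5.2 instead.
FALLBACK_ARCH="sm_52"

# Assumption: Theano's old CUDA backend, where nvcc.flags is forwarded to nvcc.
export THEANO_FLAGS="nvcc.flags=-arch=${FALLBACK_ARCH}"

# Show what Theano will pick up from the environment.
echo "$THEANO_FLAGS"
```

<p dir="ltr">The same setting can probably live in ~/.theanorc instead, as <code>flags=-arch=sm_52</code> under an <code>[nvcc]</code> section, if you'd rather not export it in every shell.</p>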
<br><div class="gmail_quote"><div dir="ltr">On Fri, Oct 21, 2016, 8:11 PM Kirthevasan Kandasamy <<a href="mailto:kandasamy@cmu.edu">kandasamy@cmu.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr" class="gmail_msg">Thanks Dougal. I'll take a look at this and get back to you.<div class="gmail_msg">So are you suggesting that this is an issue with the Titan Xs not being compatible with 7.5?</div></div><div class="gmail_extra gmail_msg"><br class="gmail_msg"><div class="gmail_quote gmail_msg">On Fri, Oct 21, 2016 at 3:08 PM, Dougal Sutherland <span dir="ltr" class="gmail_msg"><<a href="mailto:dougal@gmail.com" class="gmail_msg" target="_blank">dougal@gmail.com</a>></span> wrote:<br class="gmail_msg"><blockquote class="gmail_quote gmail_msg" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><p dir="ltr" class="gmail_msg">I installed it in my scratch directory (not sure if there's a global install?). The main thing was to put its cache on scratch; it got really upset when the cache directory was on NFS. (Instructions at the bottom of my previous email.)</p><div class="m_-7882656025475622117HOEnZb gmail_msg"><div class="m_-7882656025475622117h5 gmail_msg">
<br class="gmail_msg"><div class="gmail_quote gmail_msg"><div dir="ltr" class="gmail_msg">On Fri, Oct 21, 2016, 8:04 PM Barnabas Poczos <<a href="mailto:bapoczos@cs.cmu.edu" class="gmail_msg" target="_blank">bapoczos@cs.cmu.edu</a>> wrote:<br class="gmail_msg"></div><blockquote class="gmail_quote gmail_msg" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">That's great! Thanks Dougal.<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
As I remember bazel was not installed correctly previously on GPU3. Do<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
you know what went wrong with it before and why it is good now?<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
Thanks,<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
Barnabas<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
======================<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
Barnabas Poczos, PhD<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
Assistant Professor<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
Machine Learning Department<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
Carnegie Mellon University<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
On Fri, Oct 21, 2016 at 2:03 PM, Dougal Sutherland <<a href="mailto:dougal@gmail.com" class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg" target="_blank">dougal@gmail.com</a>> wrote:<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> I was just able to build tensorflow 0.11.0rc0 on gpu3! I used the cuda 8.0<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> install, and it built fine. So additionally installing 7.5 was probably not<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> necessary; in fact, cuda 7.5 doesn't know about the 6.1 compute architecture<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> that the Titan Xs use, so Theano at least needs to be manually told to use<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> an older architecture.<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> A pip package is in ~dsutherl/tensorflow-0.11.0rc0-py2-none-any.whl. I think<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> it should work fine with the cudnn in my scratch directory.<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> You should probably install it to scratch, either running this first to put<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> libraries in your scratch directory or using a virtualenv or something:<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> export PYTHONUSERBASE=/home/scratch/$USER/.local<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> You'll need this to use the library and probably to install it:<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> export<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> LD_LIBRARY_PATH=/home/scratch/dsutherl/cudnn-8.0-5.1/cuda/lib64:"$LD_LIBRARY_PATH"<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> To install:<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> pip install --user ~dsutherl/tensorflow-0.11.0rc0-py2-none-any.whl<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> (remove --user if you're using a virtualenv)<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> (A request: I'm submitting to ICLR in two weeks, and for some of the models<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> I'm running gpu3's cards are 4x the speed of gpu1 or 2's. So please don't<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> run a ton of stuff on gpu3 unless you're working on a deadline too.)<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> Steps to install it, for the future:<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> Install bazel in your home directory:<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> wget<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> <a href="https://github.com/bazelbuild/bazel/releases/download/0.3.2/bazel-0.3.2-installer-linux-x86_64.sh" rel="noreferrer" class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg" target="_blank">https://github.com/bazelbuild/bazel/releases/download/0.3.2/bazel-0.3.2-installer-linux-x86_64.sh</a><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> bash <a href="http://bazel-0.3.2-installer-linux-x86_64.sh" rel="noreferrer" class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg" target="_blank">bazel-0.3.2-installer-linux-x86_64.sh</a> --prefix=/home/scratch/$USER<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> --base=/home/scratch/$USER/.bazel<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> Configure bazel to build in scratch. There's probably a better way to do<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> this, but this works:<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> mkdir -p /home/scratch/$USER/.cache/bazel<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> ln -s /home/scratch/$USER/.cache/bazel ~/.cache/bazel<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> Build tensorflow. Note that builds from git checkouts don't work, because<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> they assume a newer version of git than is on gpu3:<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> cd /home/scratch/$USER<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> wget<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> tar xf<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> cd tensorflow-0.11.0rc0<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> ./configure<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> This is an interactive script that doesn't seem to let you pass arguments or<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> anything. It's obnoxious.<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> Use the default python<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> don't use cloud platform or hadoop file system<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> use the default site-packages path if it asks<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> build with GPU support<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> default gcc<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> default Cuda SDK version<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> specify /usr/local/cuda-8.0<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> default cudnn version<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> specify $CUDNN_DIR from use-cudnn.sh, e.g.<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> /home/scratch/dsutherl/cudnn-8.0-5.1/cuda<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> Pascal Titan Xs have compute capability 6.1<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> bazel build -c opt --config=cuda<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> //tensorflow/tools/pip_package:build_pip_package<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> bazel-bin/tensorflow/tools/pip_package/build_pip_package ./<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> A .whl file, e.g. tensorflow-0.11.0rc0-py2-none-any.whl, is put in the<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> directory you specified above.<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> - Dougal<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> On Fri, Oct 21, 2016 at 6:14 PM Kirthevasan Kandasamy <<a href="mailto:kandasamy@cmu.edu" class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg" target="_blank">kandasamy@cmu.edu</a>><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
> wrote:<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>> Predrag,<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>> Any updates on gpu3?<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>> I have tried both tensorflow and chainer and in both cases the problem<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>> seems to be with cuda<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>> On Wed, Oct 19, 2016 at 4:10 PM, Predrag Punosevac <<a href="mailto:predragp@cs.cmu.edu" class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg" target="_blank">predragp@cs.cmu.edu</a>><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>> wrote:<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> Dougal Sutherland <<a href="mailto:dougal@gmail.com" class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg" target="_blank">dougal@gmail.com</a>> wrote:<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > I tried for a while. I failed.<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> ><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> Damn this doesn't look good. I guess back to the drawing board. Thanks<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> for the quick feedback.<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> Predrag<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > Version 0.10.0 fails immediately on build: "The specified<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > --crosstool_top<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > '@local_config_cuda//crosstool:crosstool' is not a valid<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > cc_toolchain_suite<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > rule." Apparently this is because 0.10 required an older version of<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > bazel (<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > <a href="https://github.com/tensorflow/tensorflow/issues/4368" rel="noreferrer" class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg" target="_blank">https://github.com/tensorflow/tensorflow/issues/4368</a>), and I don't have<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > the<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > energy to install an old version of bazel.<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> ><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > Version 0.11.0rc0 gets almost done and then complains about no such<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > file or<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > directory for libcudart.so.7.5 (which is there, where I told tensorflow<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > it<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > was...).<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> ><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > Non-release versions from git fail immediately because they call git -C<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > to<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > get version info, which is only in git 1.9 (we have 1.8).<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> ><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> ><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > Some other notes:<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > - I made a symlink from ~/.cache/bazel to<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > /home/scratch/$USER/.cache/bazel,<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > because bazel is the worst. (It complains about doing things on NFS,<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > and<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > hung for me [clock-related?], and I can't find a global config file or<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > anything to change that in; it seems like there might be one, but their<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > documentation is terrible.)<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> ><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > - I wasn't able to use the actual Titan X compute capability of 6.1,<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > because that requires cuda 8; I used 5.2 instead. Probably not a huge<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > deal,<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > but I don't know.<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> ><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > - I tried explicitly including /usr/local/cuda/lib64 in LD_LIBRARY_PATH<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > and<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > set CUDA_HOME to /usr/local/cuda before building, hoping that would<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > help<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>>> > with the 0.11.0rc0 problem, but it didn't.<br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
>><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
><br class="m_-7882656025475622117m_-6269736360502246421gmail_msg gmail_msg">
</blockquote></div>
</div></div></blockquote></div><br class="gmail_msg"></div>
</blockquote></div>