PyTorch
Predrag Punosevac
predragp at andrew.cmu.edu
Mon Mar 26 22:50:12 EDT 2018
Manzil Zaheer <manzil at cmu.edu> wrote:
> Thanks for the detailed analysis. But I am using pytorch. I have not tried Lua torch. Can you please check? Thanks again!
>
I did. You have Python 3.6.4 in /opt/miniconda3/bin/python3.6
predrag at gpu3$ /opt/miniconda3/bin/python3.6
Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 18:10:19)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
Try reinstalling things in your scratch directory with
/opt/miniconda3/bin/conda install pytorch torchvision cuda91 -c pytorch
You should see something like
The following packages will be downloaded:

    package                    |                        build
    ---------------------------|-----------------------------
    pillow-5.0.0               |               py36h3deb7b8_0      561 KB
    mkl-2018.0.2               |                            1    205.2 MB
    cuda91-1.0                 |                   h4c16780_0        3 KB  pytorch
    libpng-1.6.34              |                   hb9fc6fc_0      334 KB
    freetype-2.8               |                   hab7d2ae_1      804 KB
    libgfortran-ng-7.2.0       |                   hdf63c60_3      1.2 MB
    intel-openmp-2018.0.0      |                            8      620 KB
    libtiff-4.0.9              |                   h28f6b97_0      586 KB
    pytorch-0.3.1              | py36_cuda9.1.85_cudnn7.0.5_2    475.0 MB  pytorch
    torchvision-0.2.0          |               py36h17b6947_1      102 KB  pytorch
    jpeg-9b                    |                   h024ee3a_2      248 KB
    numpy-1.14.2               |               py36hdbf6ddf_0      4.0 MB
    olefile-0.45.1             |                       py36_0       47 KB
    ------------------------------------------------------------
                                                   Total:        688.7 MB
Make sure you use your scratch directory as the install path, since the
file server is full. I got a clean installation but I didn't play with it
further. One thing that worries me is this line:

    pytorch-0.3.1 | py36_cuda9.1.85_cudnn7.0.5_2    475.0 MB  pytorch

We had problems with cuDNN on CUDA 9.1, apparently because upstream was
assuming 7.0.5 when in reality I have 7.1.1 for CUDA 9 or even 7.1.5 for
CUDA 9.1. GPU3 has cuDNN library 7.0.5 in cuda-9.0, so try adjusting the
conda command accordingly.
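
To make the version concern above concrete, here is a minimal sketch that
pulls the CUDA and cuDNN versions out of a conda build string such as the
one in the listing and compares them against what is installed on a node.
The helper names (parse_build_string, same_release) are made up for this
illustration, and comparing only major.minor is an assumption about what
counts as compatible, not a rule taken from PyTorch or NVIDIA.

```python
import re


def parse_build_string(build):
    """Return (cuda, cudnn) version strings found in a conda build string.

    Either element is None when the build string does not mention it.
    """
    cuda = re.search(r"cuda(\d+(?:\.\d+)*)", build)
    cudnn = re.search(r"cudnn(\d+(?:\.\d+)*)", build)
    return (
        cuda.group(1) if cuda else None,
        cudnn.group(1) if cudnn else None,
    )


def same_release(expected, installed, parts=2):
    """Compare the first `parts` components (major.minor by default)."""
    return expected.split(".")[:parts] == installed.split(".")[:parts]


if __name__ == "__main__":
    # Build string copied from the conda listing in this message.
    cuda, cudnn = parse_build_string("py36_cuda9.1.85_cudnn7.0.5_2")
    print("package built against CUDA", cuda, "and cuDNN", cudnn)

    # Installed cuDNN versions as reported in this message.
    for installed in ("7.0.5", "7.1.1"):
        verdict = "matches" if same_release(cudnn, installed) else "differs from"
        print("installed cuDNN", installed, verdict, "the build's", cudnn)
```

The installed versions are typed in by hand here; on a real node they would
have to be read from the local CUDA installation.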
Best,
Predrag
>
>
> Sent from my Samsung Galaxy smartphone.
>
>
> -------- Original message --------
> From: Predrag Punosevac <predragp at andrew.cmu.edu>
> Date: 3/26/18 9:00 PM (GMT-05:00)
> To: Manzil Zaheer <manzil at cmu.edu>
> Cc: Barnabas Poczos <bapoczos at andrew.cmu.edu>, users at autonlab.org
> Subject: Re: Lua Torch
>
> Manzil Zaheer <manzil at cmu.edu> wrote:
>
> > Hi Predrag,
> >
> > I am not able to use any GPUs on gpu5,6,7,9. I tried all 3 versions of cuda, but I get the following error:
> >
>
>
> I was able to build it after adding this
>
> export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"
>
> per
>
> https://github.com/torch/torch7/issues/1086
>
> When I try to run it I get errors that Lua packages are missing (probably
> due to my path variables). I have a vague recollection that Simon and I
> helped you once with this thing in the past. IIRC it was very picky about
> the version of some Lua package and required their version, not the one
> which comes with yum.
>
> Anyhow I am forwarding this to users at autonlab in hope somebody is using
> it and might be of more help. Please stop by NSH 3119 and let us try to
> debug this.
>
> Predrag
>
>
>
>
> > THCudaCheck FAIL file=/pytorch/torch/lib/THC/THCGeneral.c line=70 error=30 : unknown error
> > Traceback (most recent call last):
> > File "<stdin>", line 1, in <module>
> > File "/zfsauton/home/manzilz/local/lib/python3.6/site-packages/torch/cuda/__init__.py", line 384, in _lazy_new
> > _lazy_init()
> > File "/zfsauton/home/manzilz/local/lib/python3.6/site-packages/torch/cuda/__init__.py", line 142, in _lazy_init
> > torch._C._cuda_init()
> > RuntimeError: cuda runtime error (30) : unknown error at /pytorch/torch/lib/THC/THCGeneral.c:70
> >
> > Can you kindly look into it?
> >
> > Thanks,
> > Manzil
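
The traceback quoted above fails inside torch._C._cuda_init(), so before
reinstalling it can help to record what the Python environment itself
reports. The following is only a sketch: cuda_report is a hypothetical
helper name, and it assumes nothing beyond the public torch.version.cuda,
torch.backends.cudnn.version(), torch.cuda.is_available() and
torch.cuda.device_count() calls. It catches errors so that it still prints
something when torch is missing or CUDA initialization fails.

```python
import os


def cuda_report():
    """Collect basic CUDA facts without raising if torch or a GPU is missing."""
    report = {"CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES")}
    try:
        import torch
    except (ImportError, OSError):
        # torch is not installed, or its shared libraries failed to load.
        report["torch"] = None
        return report

    report["torch"] = torch.__version__
    # CUDA version the installed torch binary was built against (None on CPU builds).
    report["built_for_cuda"] = getattr(torch.version, "cuda", None)
    try:
        report["cudnn"] = torch.backends.cudnn.version()
        report["cuda_available"] = torch.cuda.is_available()
        report["device_count"] = torch.cuda.device_count()
    except Exception as exc:  # e.g. a runtime error raised during CUDA init
        report["error"] = repr(exc)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

Running it with the same interpreter on a working node and on a failing one
and comparing the two outputs should show whether the difference lies in
the torch build or in the node's driver and libraries.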