CUDA Error

Jayanth Koushik jkoushik at andrew.cmu.edu
Fri Aug 31 11:34:23 EDT 2018


The last line of the error refers to a different conda installation (under /opt/conda, not your anaconda3 directory). Can you check that all of your paths are correct?
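
One way to check the paths is a small script like the one below. It is only a sketch, written for Python 3 (the lstmpy27 environment in the traceback is Python 2.7, so it would need adapting there). The helper names describe_environment and package_in_active_env are made up for illustration. It prints which interpreter, conda prefix, and torch package are actually picked up, plus two environment variables that commonly affect CUDA library loading. Keep in mind that a /opt/conda/conda-bld/... path in a PyTorch message may simply be the source location recorded when the binary package was built, so on its own it does not prove a second conda is in use.

```python
import importlib.util
import os
import shutil
import sys


def describe_environment(package="torch"):
    """Collect the paths that decide which Python, conda, and package are used."""
    # find_spec locates a top-level package without importing it;
    # it returns None when the package is not installed.
    spec = importlib.util.find_spec(package)
    return {
        "python": sys.executable,
        "prefix": sys.prefix,
        "conda_prefix": os.environ.get("CONDA_PREFIX"),  # set by `conda activate`
        "conda_on_path": shutil.which("conda"),
        "package_origin": spec.origin if spec else None,
        "cuda_visible_devices": os.environ.get("CUDA_VISIBLE_DEVICES"),
        "ld_library_path": os.environ.get("LD_LIBRARY_PATH"),
    }


def package_in_active_env(info):
    """Return True if the package file lives under the running interpreter's prefix."""
    origin = info["package_origin"]
    if origin is None:
        return False
    prefix = os.path.realpath(info["prefix"]) + os.sep
    return os.path.realpath(origin).startswith(prefix)


if __name__ == "__main__":
    info = describe_environment()
    for key, value in info.items():
        print(f"{key:22} {value}")
    print("package inside active env:", package_in_active_env(info))
```

If "package inside active env" comes out False, or conda_on_path points somewhere other than your anaconda3 directory, the environment is probably mixing two installations.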

~Jayanth

On Aug 31, 2018, at 11:23 AM, Predrag Punosevac <predragp at andrew.cmu.edu> wrote:

Vincent Jeanselme <vjeansel at andrew.cmu.edu> wrote:

> Good Morning,

Let's try users at autonlab.org


Predrag

> 
> Since the hard drive was replaced, I get the following error when I
> run my code on the GPUs (I have reinstalled PyTorch, but that did not
> solve the problem). I think the problem comes from the CUDA library.
> 
>    THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1524577177097/work/aten/src/THC/THCTensorRandom.cu line=25 error=30 : unknown error
>    Traceback (most recent call last):
>      File "./train.py", line 519, in <module>
>        main(args)
>      File "./train.py", line 61, in main
>        model = nn.DataParallel(model).cuda()
>      File "/zfsauton/home/vjeanselme/anaconda3/envs/lstmpy27/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 102, in __init__
>        _check_balance(self.device_ids)
>      File "/zfsauton/home/vjeanselme/anaconda3/envs/lstmpy27/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 17, in _check_balance
>        dev_props = [torch.cuda.get_device_properties(i) for i in device_ids]
>      File "/zfsauton/home/vjeanselme/anaconda3/envs/lstmpy27/lib/python2.7/site-packages/torch/cuda/__init__.py", line 290, in get_device_properties
>        init()  # will define _get_device_properties and _CudaDeviceProperties
>      File "/zfsauton/home/vjeanselme/anaconda3/envs/lstmpy27/lib/python2.7/site-packages/torch/cuda/__init__.py", line 143, in init
>        _lazy_init()
>      File "/zfsauton/home/vjeanselme/anaconda3/envs/lstmpy27/lib/python2.7/site-packages/torch/cuda/__init__.py", line 161, in _lazy_init
>        torch._C._cuda_init()
>    RuntimeError: cuda runtime error (30) : unknown error at /opt/conda/conda-bld/pytorch_1524577177097/work/aten/src/THC/THCTensorRandom.cu:25
> 
> I don't know how to fix it. Do you have any suggestions?
> 
> Thank you,
> 
> -- 
> Vincent Jeanselme
> -----------------
> Analyst Researcher
> Auton Lab - Robotics Institute
> Carnegie Mellon University
> 