CUDA Error

Elena Giusarma elenagiusarma at gmail.com
Tue Sep 4 15:39:25 EDT 2018


Hi,

I am getting this error:

    net.cuda(3)
  File "/zfsauton/home/egiusarm/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 258, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/zfsauton/home/egiusarm/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 185, in _apply
    module._apply(fn)
  File "/zfsauton/home/egiusarm/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 191, in _apply
    param.data = fn(param.data)
  File "/zfsauton/home/egiusarm/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/nn/modules/module.py", line 258, in <lambda>
    return self._apply(lambda t: t.cuda(device))
RuntimeError: CUDA error: unknown error

I never had this error before; I have always used PyTorch without problems.
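
In case it is useful for the path check suggested in the quoted reply below, here is a minimal sketch that prints which Python and which torch installation are being picked up. It assumes Python 3 and uses only the standard library. The function name describe_environment is made up for illustration, and the environment variables listed are only a guess at the ones that matter here.

```python
import importlib.util
import os
import sys


def describe_environment():
    """Collect the paths that decide which Python and which torch get used."""
    # find_spec locates the package without importing it, so this still
    # works when importing torch (or initialising CUDA) would fail.
    spec = importlib.util.find_spec("torch")
    return {
        "python_executable": sys.executable,
        "python_prefix": sys.prefix,
        "torch_location": spec.origin if spec is not None else None,
        # Environment variables that commonly influence which conda
        # environment, shared libraries, and GPUs a process sees.
        "CONDA_PREFIX": os.environ.get("CONDA_PREFIX"),
        "LD_LIBRARY_PATH": os.environ.get("LD_LIBRARY_PATH"),
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this inside the activated environment and comparing torch_location with the site-packages path in the traceback above should show whether they belong to the same installation.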


thanks,
Elena


On Mon, Sep 3, 2018 at 09:43, Emre Yolcu <eyolcu at cs.cmu.edu> wrote:

> I'm getting the same error.
>
> On Sat, Sep 1, 2018 at 12:58 PM, Yichong Xu <yichongx at cs.cmu.edu> wrote:
>
>> Hi,
>>
>> I’m having the same problem here - @ Vincent have you figured out how to
>> fix this?
>>
>> >>> import torch
>> >>> a=torch.zeros(4,4)
>> >>> a.cuda()
>> THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1524590031827/work/aten/src/THC/THCTensorRandom.cu line=25 error=30 : unknown error
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> RuntimeError: cuda runtime error (30) : unknown error at /opt/conda/conda-bld/pytorch_1524590031827/work/aten/src/THC/THCTensorRandom.cu:25
>>
>>
>>
>> Previously I could use PyTorch without errors.
>>
>>
>>
>> *Thanks,*
>>
>> *Yichong*
>>
>>
>>
>>
>>
>>
>>
>> *From:* Autonlab-users <autonlab-users-bounces at autonlab.org> *On Behalf
>> Of *Jayanth Koushik
>> *Sent:* August 31, 2018 11:34
>> *To:* Predrag Punosevac <predragp at andrew.cmu.edu>
>> *Cc:* users at autonlab.org
>> *Subject:* Re: CUDA Error
>>
>>
>>
>> The last line of the error refers to a different conda. Can you make sure
>> all paths are correct?
>>
>> ~Jayanth
>>
>>
>> On Aug 31, 2018, at 11:23 AM, Predrag Punosevac <predragp at andrew.cmu.edu>
>> wrote:
>>
>> Vincent Jeanselme <vjeansel at andrew.cmu.edu> wrote:
>>
>>
>> Good Morning,
>>
>>
>> Let's try users at autonlab.org
>>
>>
>> Predrag
>>
>>
>>
>>
>> Since the hard drive was changed, I get the following error when I run
>> on the GPUs (I have reinstalled PyTorch, but that did not solve the
>> problem). I think the problem comes from the CUDA library.
>>
>>
>>
>>    THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1524577177097/work/aten/src/THC/THCTensorRandom.cu line=25 error=30 : unknown error
>>    Traceback (most recent call last):
>>      File "./train.py", line 519, in <module>
>>        main(args)
>>      File "./train.py", line 61, in main
>>        model = nn.DataParallel(model).cuda()
>>      File "/zfsauton/home/vjeanselme/anaconda3/envs/lstmpy27/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 102, in __init__
>>        _check_balance(self.device_ids)
>>      File "/zfsauton/home/vjeanselme/anaconda3/envs/lstmpy27/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 17, in _check_balance
>>        dev_props = [torch.cuda.get_device_properties(i) for i in device_ids]
>>      File "/zfsauton/home/vjeanselme/anaconda3/envs/lstmpy27/lib/python2.7/site-packages/torch/cuda/__init__.py", line 290, in get_device_properties
>>        init()  # will define _get_device_properties and _CudaDeviceProperties
>>      File "/zfsauton/home/vjeanselme/anaconda3/envs/lstmpy27/lib/python2.7/site-packages/torch/cuda/__init__.py", line 143, in init
>>        _lazy_init()
>>      File "/zfsauton/home/vjeanselme/anaconda3/envs/lstmpy27/lib/python2.7/site-packages/torch/cuda/__init__.py", line 161, in _lazy_init
>>        torch._C._cuda_init()
>>    RuntimeError: cuda runtime error (30) : unknown error at /opt/conda/conda-bld/pytorch_1524577177097/work/aten/src/THC/THCTensorRandom.cu:25
>>
>>
>>
>> I don't know how to fix it. Do you have any suggestions?
>>
>>
>>
>> Thank you,
>>
>>
>>
>> --
>>
>> Vincent Jeanselme
>>
>> -----------------
>>
>> Analyst Researcher
>>
>> Auton Lab - Robotics Institute
>>
>> Carnegie Mellon University
>>
>>
>>
>>
>