CUDA Error

Yichong Xu yichongx at cs.cmu.edu
Sat Sep 1 12:58:13 EDT 2018


Hi,
I’m having the same problem here. @Vincent, have you figured out how to fix this?
>>> import torch
>>> a=torch.zeros(4,4)
>>> a.cuda()
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1524590031827/work/aten/src/THC/THCTensorRandom.cu line=25 error=30 : unknown error
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: cuda runtime error (30) : unknown error at /opt/conda/conda-bld/pytorch_1524590031827/work/aten/src/THC/THCTensorRandom.cu:25

Previously I could use PyTorch without errors.

Thanks,
Yichong



From: Autonlab-users <autonlab-users-bounces at autonlab.org> On Behalf Of Jayanth Koushik
Sent: August 31, 2018 11:34
To: Predrag Punosevac <predragp at andrew.cmu.edu>
Cc: users at autonlab.org
Subject: Re: CUDA Error

The last line of the error refers to a different conda. Can you make sure all paths are correct?
~Jayanth
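
One way to check the paths is to print which interpreter and which installed copy of the package are actually being picked up. The snippet below is only a minimal sketch, assuming Python 3; the helper name `describe_environment` is made up for illustration. It locates the package without importing it, so it runs even when CUDA initialization fails. Note that a `/opt/conda/conda-bld/...` path in a message like this may simply be the source path recorded when the binary was built, so it does not necessarily mean a second conda is on the machine.

```python
import importlib.util
import os
import shutil
import sys


def describe_environment(package="torch"):
    """Report which interpreter, conda and package location are in use.

    `describe_environment` is a hypothetical helper, not part of PyTorch.
    """
    # find_spec locates a top-level package without importing it.
    spec = importlib.util.find_spec(package)
    return {
        "python": sys.executable,
        "prefix": sys.prefix,
        "conda_on_path": shutil.which("conda"),
        "conda_prefix": os.environ.get("CONDA_PREFIX"),
        "package_location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `python`, `conda_prefix` and `package_location` do not all point into the same environment directory, the shell is probably mixing two installations.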

On Aug 31, 2018, at 11:23 AM, Predrag Punosevac <predragp at andrew.cmu.edu> wrote:
Vincent Jeanselme <vjeansel at andrew.cmu.edu> wrote:


Good Morning,

Let's try users at autonlab.org


Predrag



Since the hard drive was changed, I get the following error when I
run my code on the GPUs (reinstalling PyTorch did not solve the
problem). I think the problem comes from the CUDA library.

   THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1524577177097/work/aten/src/THC/THCTensorRandom.cu line=25 error=30 : unknown error
   Traceback (most recent call last):
     File "./train.py", line 519, in <module>
       main(args)
     File "./train.py", line 61, in main
       model = nn.DataParallel(model).cuda()
     File "/zfsauton/home/vjeanselme/anaconda3/envs/lstmpy27/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 102, in __init__
       _check_balance(self.device_ids)
     File "/zfsauton/home/vjeanselme/anaconda3/envs/lstmpy27/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 17, in _check_balance
       dev_props = [torch.cuda.get_device_properties(i) for i in device_ids]
     File "/zfsauton/home/vjeanselme/anaconda3/envs/lstmpy27/lib/python2.7/site-packages/torch/cuda/__init__.py", line 290, in get_device_properties
       init()  # will define _get_device_properties and _CudaDeviceProperties
     File "/zfsauton/home/vjeanselme/anaconda3/envs/lstmpy27/lib/python2.7/site-packages/torch/cuda/__init__.py", line 143, in init
       _lazy_init()
     File "/zfsauton/home/vjeanselme/anaconda3/envs/lstmpy27/lib/python2.7/site-packages/torch/cuda/__init__.py", line 161, in _lazy_init
       torch._C._cuda_init()
   RuntimeError: cuda runtime error (30) : unknown error at /opt/conda/conda-bld/pytorch_1524577177097/work/aten/src/THC/THCTensorRandom.cu:25

I don't know how to fix it. Would you have any suggestions?
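
Since the failure happens while CUDA is being initialized, one thing that might be worth checking first is whether the NVIDIA kernel driver looks healthy on the node, independently of PyTorch. The snippet below is only a rough sketch under that assumption (Linux only, Python 3); `nvidia_driver_status` is a made-up helper name, and the default paths are the usual locations of the driver's version file and device nodes. It does not prove that the driver is the cause here.

```python
import glob
import os


def nvidia_driver_status(proc_version="/proc/driver/nvidia/version",
                         dev_glob="/dev/nvidia*"):
    """Collect basic signs that the NVIDIA kernel driver is loaded.

    Hypothetical helper for illustration; both arguments can be
    overridden, e.g. for testing on a machine without a GPU.
    """
    version = None
    if os.path.isfile(proc_version):
        # The first line normally names the loaded driver version.
        with open(proc_version) as handle:
            version = handle.readline().strip()
    device_nodes = sorted(glob.glob(dev_glob))
    return {
        "driver_version_line": version,
        "device_nodes": device_nodes,
        "uvm_node_present": any(
            os.path.basename(path) == "nvidia-uvm" for path in device_nodes
        ),
    }


if __name__ == "__main__":
    for key, value in nvidia_driver_status().items():
        print(f"{key}: {value}")
```

A missing version line or an empty list of device nodes would point at the driver installation rather than at the Python environment.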

Thank you,

--
Vincent Jeanselme
-----------------
Analyst Researcher
Auton Lab - Robotics Institute
Carnegie Mellon University
