RuntimeError: CUDA error: device-side assert triggered #5

@zxy-bjtu

Thanks for your excellent work!
But when I run the command `python exp_runner.py --mode train --conf ./confs/wmask_open.conf --case real_capture_fan`, I get the following error:

    /opt/conda/conda-bld/pytorch_1614378124864/work/aten/src/ATen/native/cuda/Loss.cu:102: operator(): block: [0,0,0], thread: [0,0,0] Assertion `input_val >= zero && input_val <= one` failed.
    ……
    Traceback (most recent call last):
      File "exp_runner.py", line 934, in <module>
        runner.train()
      File "exp_runner.py", line 204, in train
        loss.backward()
      File "/home/zxy/.conda/envs/neus/lib/python3.7/site-packages/torch/tensor.py", line 245, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
      File "/home/zxy/.conda/envs/neus/lib/python3.7/site-packages/torch/autograd/__init__.py", line 147, in backward
        allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag
    RuntimeError: CUDA error: device-side assert triggered

Is this a problem with the CUDA device? Which parameters can I adjust to get it running?
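
For context, the assertion `input_val >= zero && input_val <= one` in `Loss.cu` is the range check in PyTorch's binary cross-entropy CUDA kernel, so it usually points to a bad input value (outside [0, 1], or NaN, which fails both comparisons) rather than to a faulty device. Below is a minimal sketch of how such values could be located before they reach the loss. `find_invalid_probabilities` is a hypothetical helper written for illustration; it is not part of this repository or of PyTorch, and it assumes the values have already been copied into a flat Python sequence.

```python
import math


def find_invalid_probabilities(values):
    """Return (index, value) pairs that would fail the check 0 <= v <= 1.

    Hypothetical diagnostic helper: it mirrors the range check from the
    assertion message, and also flags NaN, which fails both comparisons.
    """
    bad = []
    for i, v in enumerate(values):
        v = float(v)
        if math.isnan(v) or v < 0.0 or v > 1.0:
            bad.append((i, v))
    return bad


if __name__ == "__main__":
    # Made-up sample values, not taken from any real training run.
    sample = [0.0, 0.5, 1.0, 1.2, -0.1, float("nan")]
    for index, value in find_invalid_probabilities(sample):
        print(f"index {index}: {value}")
```

With a PyTorch tensor `t`, the same check could be fed with `t.detach().flatten().tolist()`. Because CUDA kernels run asynchronously, the Python traceback may not point at the operation that actually failed; rerunning with the environment variable `CUDA_LAUNCH_BLOCKING=1`, or on the CPU, generally gives a more precise location.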
