
CUDA error: an illegal memory access was encountered #6

@zacharykzhao

I encountered a CUDA illegal memory access error while running benchflops.py.

Traceback (most recent call last):
File "/Quantization/MIXQ-main/benchflops.py", line 313, in <module>
main(args)
File "/Quantization/MIXQ-main/benchflops.py", line 268, in main
stats, model_version = run_round(
^^^^^^^^^^
File "/Quantization/MIXQ-main/benchflops.py", line 222, in run_round
raise RuntimeError(ex)

File "/Quantization/MIXQ-main/benchflops.py", line 121, in generate
out = model(inputs, use_cache=True)

File "/Quantization/MIXQ-main/mixquant/modules/linear.py", line 199, in forward
if cache.x_scale[0:M].max() > self.sigma / (( 2 ** (self.bit - 1) - 1 ) ) :

RuntimeError: CUDA error: an illegal memory access was encountered
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

I have traced the failure to cache.x_scale, which is not accessible at this point. What could be the source of this error?

Could the CUDA toolkit version be causing it?
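
One caveat on locating the fault: CUDA kernels launch asynchronously, so an illegal access in an earlier kernel can surface at the next point that synchronizes with the device, and the `if ... .max() > ...` comparison on line 199 is such a point. Below is a minimal sketch for narrowing this down, assuming it runs before the first CUDA call. `describe` and `sync_checkpoint` are hypothetical helper names, not part of MIXQ or PyTorch.

```python
import os

# Assumption: this executes before the first CUDA call (simplest: before
# importing torch). With blocking launches, the Python traceback points at
# the kernel that actually faulted rather than at a later sync point.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


def describe(name, t):
    """Summarize tensor metadata only; this does not read device memory."""
    return f"{name}: shape={tuple(t.shape)} dtype={t.dtype} device={t.device}"


def sync_checkpoint(label):
    """Wait for pending CUDA work so a deferred error surfaces here."""
    import torch  # imported lazily so the env var above is already set

    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return label
```

A possible placement, just before the failing comparison in mixquant/modules/linear.py: call `sync_checkpoint("before x_scale check")`, then `print(describe("cache.x_scale", cache.x_scale), "M =", M)`. If the checkpoint itself raises, the faulting kernel ran earlier than line 199; if it passes, comparing `M` against the printed shape shows whether the `[0:M]` slice fits the buffer.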

My environment:

CUDA Version: 12.4

torch 2.4.0+cu124
transformers 4.45.1
