I encountered a CUDA illegal memory access error.
Traceback (most recent call last):
  File "/Quantization/MIXQ-main/benchflops.py", line 313, in <module>
    main(args)
  File "/Quantization/MIXQ-main/benchflops.py", line 268, in main
    stats, model_version = run_round(
                           ^^^^^^^^^^
  File "/Quantization/MIXQ-main/benchflops.py", line 222, in run_round
    raise RuntimeError(ex)
  File "/Quantization/MIXQ-main/benchflops.py", line 121, in generate
    out = model(inputs, use_cache=True)
  File "/Quantization/MIXQ-main/mixquant/modules/linear.py", line 199, in forward
    if cache.x_scale[0:M].max() > self.sigma / (( 2 ** (self.bit - 1) - 1 ) ) :
RuntimeError: CUDA error: an illegal memory access was encountered
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
I have traced the failure to cache.x_scale, which is not accessible at this point. What are the potential sources of this error?
Could the CUDA toolkit version cause it?
My environment:
CUDA Version: 12.4
torch 2.4.0+cu124
transformers 4.45.1
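
One caveat on locating the failure: CUDA kernel errors can be reported asynchronously, so the line in the traceback (the read of cache.x_scale) may only be where an earlier kernel's fault surfaced. Below is a minimal sketch of how the failing step could be narrowed down, assuming CUDA_LAUNCH_BLOCKING=1 is set before the first CUDA call. The helper name run_synced is hypothetical and not part of MIXQ or PyTorch.

```python
import os

# Assumption: this runs before any CUDA call, so kernel launches become synchronous
# and the Python traceback points closer to the kernel that actually faulted.
os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")

try:
    import torch
except ImportError:  # keeps the sketch importable on a machine without PyTorch
    torch = None


def run_synced(label, fn, *args, **kwargs):
    """Hypothetical helper: call fn, then synchronize so a pending CUDA error is raised here."""
    out = fn(*args, **kwargs)
    if torch is not None and torch.cuda.is_available():
        try:
            torch.cuda.synchronize()
        except RuntimeError as err:
            # Re-raise with the step label so the faulty stage is easy to identify.
            raise RuntimeError(f"CUDA error surfaced after step '{label}'") from err
    return out
```

For example, the call in generate could be wrapped as run_synced("forward", model, inputs, use_cache=True), and the same wrapper could be placed around the steps that run before line 199 of linear.py to see which one first raises.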