Description
Hi, I am testing the performance of cuBLASLt GEMM with NVFP4 on an RTX 5080, basically following the code in sample_cublasLt_LtNvfp4Matmul.cu. At first I changed nothing and it worked perfectly.
Then I tried to get a bfloat16 tensor D, and a core dump with error code 7 happened.
It would be very nice of you to take a look.
ENV:
nvcc version: cuda_12.8.r12.8/compiler.35583870_0
driver version: 570.172.08
Here is the problem when I try to get a bfloat16 tensor D:
Situation 1: I commented out lines 86 and 92 (i.e. only disabled the D_out quantization); the error occurred.
Situation 2: I commented out lines 85, 86, 91 and 92 (disabled both the per-tensor-wide dequantization and the D_out quantization); the error was gone and it worked perfectly again (see code_1.cu in the attachment).
My questions:
- I am not sure whether I am using the cublasLtMatmul() API correctly, or whether there is a correct way to get a full NVFP4 GEMM (FP4 block-scaled GEMM with the inner FP8 block scales plus a per-tensor FP32 dequantization scale, as in Situation 1). I sketch the reference math I have in mind after this list.
- Besides, the "1D Block Scaling Factors Layout" section of the docs confuses me a little. Could you show a simple example in PyTorch-like style of how to produce the right scale layout? (For example, the contiguous tensor A has shape (m, k), and we already have the contiguous scale_a with shape (m, k // 16).) My current attempt is the one-liner below, followed by a more explicit helper version.
```python
scale_a = scale_a.view(m // 128, 4, 32, k // 64, 4).permute(0, 3, 2, 1, 4).contiguous()  # here is my understanding, but it seems wrong?
```
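Here is the same attempt written out as a standalone helper. The function name swizzle_scale and the divisibility assumptions are my own; the 128x4 tiling follows recipes I have seen in open-source block-scaled GEMM code, and I am not sure it matches exactly what cuBLASLt expects:

```python
import torch

# A more explicit version of the one-liner above, assuming m % 128 == 0 and k % 64 == 0.
# swizzle_scale is my own helper name, not a cuBLASLt API.
def swizzle_scale(scale: torch.Tensor) -> torch.Tensor:
    rows, cols = scale.shape                      # (m, k // 16) for A
    assert rows % 128 == 0 and cols % 4 == 0
    # Cut the scale matrix into 128x4 tiles, ordered row-tile-major.
    tiles = scale.view(rows // 128, 128, cols // 4, 4).permute(0, 2, 1, 3)
    # Inside each tile, the 128 rows are split into 4 groups of 32 and interleaved,
    # so each 16-wide output row holds the scales of rows {r, r+32, r+64, r+96}, 4 columns each.
    out = tiles.reshape(-1, 4, 32, 4).transpose(1, 2).reshape(-1, 32, 16)
    return out.contiguous().flatten()
```

As far as I can tell, for shapes that meet those divisibility assumptions this produces the same element order as my one-liner, so if the one-liner is wrong I would like to understand which part of the tiling I am misreading.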
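And for my first question, this is the reference (non-fused) computation I expect the full NVFP4 GEMM of Situation 1 to perform. The helper name nvfp4_gemm_reference and all argument names are my own assumptions, and the FP4 payloads / FP8 block scales are stood in by ordinary float tensors purely for emulation:

```python
import torch

# Emulation of what I expect Situation 1 to compute (names are mine, not the cuBLASLt API):
#   a_fp4, b_fp4            : FP4 payloads, here already decoded to float for emulation
#   a_block_scale, b_block_scale : per-16-element FP8 (e4m3) block scales, shapes (m, k // 16) and (n, k // 16)
#   a_tensor_scale, b_tensor_scale : per-tensor FP32 dequantization scales
def nvfp4_gemm_reference(a_fp4, a_block_scale, a_tensor_scale,
                         b_fp4, b_block_scale, b_tensor_scale):
    # Expand each block scale over its 16 contiguous K elements, then dequantize.
    a = a_fp4.float() * a_block_scale.float().repeat_interleave(16, dim=1) * a_tensor_scale
    b = b_fp4.float() * b_block_scale.float().repeat_interleave(16, dim=1) * b_tensor_scale
    # Accumulate in FP32 and cast D to bfloat16, with no D_out quantization step.
    return (a @ b.t()).to(torch.bfloat16)
```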