cuSparseLt python bindings & performance questions #301

@dxqb

Hi all,

  1. It would be great to have Python bindings for cuSparseLt, because it will be a while until PyTorch supports it for all dtypes, especially low-precision ones such as int8.
  2. Using the C++ API, I'm only getting about 30% more speed for sparse int8 x int8 compared to dense torch._int_mm. Is that expected? I would have expected more, given the hardware claims of twice the speed.

I'm pretty sure int8 x int8 matmul isn't bandwidth-limited on any modern GPU, is it?
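To sanity-check the bandwidth question, here is a rough roofline-style sketch in Python. It compares the arithmetic intensity of a GEMM (operations per byte moved) with a machine balance (peak ops/s divided by memory bandwidth). The function names and the peak and bandwidth figures are placeholders I made up for illustration, not numbers for any real GPU, and the model assumes each matrix is read or written exactly once, with int8 inputs, int32 output, and the 2:4 sparsity metadata ignored.

```python
def gemm_arithmetic_intensity(m, n, k, in_bytes=1, out_bytes=4, sparse_a=False):
    """Ops per byte for C[m, n] = A[m, k] @ B[k, n].

    Assumes every matrix crosses the memory bus exactly once
    (an idealised model; real kernels may re-read tiles).
    """
    ops = 2 * m * n * k                # multiply + add per MAC, dense-equivalent count
    a_bytes = m * k * in_bytes
    if sparse_a:
        a_bytes = a_bytes / 2          # 2:4 sparsity keeps half of A; metadata ignored
    b_bytes = k * n * in_bytes
    c_bytes = m * n * out_bytes
    return ops / (a_bytes + b_bytes + c_bytes)


def is_bandwidth_limited(intensity, peak_ops_per_s, bandwidth_bytes_per_s):
    """True if the kernel sits left of the roofline ridge point."""
    machine_balance = peak_ops_per_s / bandwidth_bytes_per_s
    return intensity < machine_balance


# HYPOTHETICAL hardware figures, chosen only to make the example concrete.
HYPOTHETICAL_PEAK_OPS = 600e12         # int8 ops per second (assumption)
HYPOTHETICAL_BANDWIDTH = 1e12          # bytes per second (assumption)

if __name__ == "__main__":
    for shape in [(4096, 4096, 4096), (16, 4096, 4096)]:
        ai = gemm_arithmetic_intensity(*shape)
        limited = is_bandwidth_limited(ai, HYPOTHETICAL_PEAK_OPS, HYPOTHETICAL_BANDWIDTH)
        print(shape, round(ai, 1), "bandwidth-limited" if limited else "compute-limited")
```

Under these assumptions a large square GEMM lands well on the compute-limited side, while a skinny one (small m) can still be bandwidth-limited, so the answer likely depends on the shapes being benchmarked. Plugging in the real peak and bandwidth figures for the GPU in question would give a more meaningful estimate.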
