replace kernel implementation using CK tile-programming performant kernels #33

@carlushuang

We plan to replace the underlying kernel implementation with the newly developed CK tile-programming FMHA kernels. They perform much better on MI200 and MI300, with the largest gains on MI300. Once this is done, the current implementation on the main branch will be deprecated.

  • fwd integration with hdim=64/128, with support for mask and varlen, and separate kernels for the padding case.
  • extend fwd to other hdims
  • dropout support
  • bwd integration (to be planned)
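
To make the first item concrete, here is a minimal Python sketch of how a forward dispatch could pick between a padded and an unpadded kernel variant. It is only an illustration under stated assumptions: `select_fwd_kernel`, `KernelChoice`, and the tile length `TILE_M = 128` are hypothetical names and values, not the CK API or this repository's code. Only hdim=64/128, mask, and varlen come from the list above.

```python
from dataclasses import dataclass

# Hypothetical values for illustration only; not taken from CK or this repo.
SUPPORTED_HDIMS = (64, 128)  # head dims named in the first task item
TILE_M = 128                 # assumed query-tile length along the sequence


@dataclass(frozen=True)
class KernelChoice:
    """Describes which (hypothetical) fwd kernel variant would be launched."""
    hdim: int
    masked: bool
    varlen: bool
    needs_padding: bool


def select_fwd_kernel(hdim, seqlens_q, masked=False, varlen=False, tile_m=TILE_M):
    """Pick a kernel variant from the head dim and the query sequence lengths.

    A padded variant is chosen when any sequence length is not a multiple of
    the tile length, so the last tile would otherwise read out of bounds.
    """
    if hdim not in SUPPORTED_HDIMS:
        raise ValueError(f"hdim={hdim} is not covered by this sketch")
    needs_padding = any(s % tile_m != 0 for s in seqlens_q)
    return KernelChoice(hdim, masked, varlen, needs_padding)


if __name__ == "__main__":
    # Tile-aligned batch: the unpadded variant is enough.
    print(select_fwd_kernel(64, [256, 128]))
    # Ragged varlen batch: one sequence is not tile-aligned, so pad.
    print(select_fwd_kernel(128, [100, 256], masked=True, varlen=True))
```

Keeping the padding decision separate from the hdim check means the aligned case can use a kernel without boundary checks, which is the usual motivation for shipping separate padded and unpadded variants.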
