fix: remove custom scaled bmm op on cpu and fix fp8 test by andrea-fasoli · Pull Request #187 · foundation-model-stack/fms-model-optimizer

andrea-fasoli · 2025-10-14T21:02:56Z

Description of the change

Removes the CPU version of the custom op scaled_mm_out, because an equivalent op has been part of the ATen set from torch 2.8 onward.
From now on, the torch-native scaled_mm_out is used when running on CPU.

Also fixes a bug in the FP8 unit test that caused a size mismatch error. Note: this test is typically not run as part of CI/CD because it requires torchao and fms to be installed, as well as a GPU with compute capability >= 8.0.

Related issues or PRs

Relates to torch 2.8 enablement for FMS-MO discussed at: #177

Signed-off-by: Andrea Fasoli <andrea.fasoli@ibm.com>

andrea-fasoli · 2025-10-15T18:32:17Z

Needs a fix for backward compatibility with torch <= 2.7:

fms_mo/aiu_addons/fp8/fp8_spyre_op.py
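
The backward-compatibility concern above amounts to registering the custom CPU op only when the installed torch predates 2.8. Below is a minimal sketch of such a version gate, not the code in this PR: the helper names `parse_major_minor` and `needs_custom_cpu_op` are hypothetical, and the version string is passed in explicitly so the snippet runs without torch installed. In practice one would presumably pass `torch.__version__`.

```python
import re


def parse_major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.8.0+cpu'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def needs_custom_cpu_op(torch_version: str) -> bool:
    """Return True when torch is older than 2.8 (hypothetical helper).

    Assumption taken from the PR description: torch 2.8 and later provide
    a native CPU implementation, so no custom op is needed there.
    Pre-release builds of 2.8 are treated the same as 2.8 itself.
    """
    # Compare integer tuples, not strings: '2.10' sorts before '2.8' as text.
    return parse_major_minor(torch_version) < (2, 8)


for candidate in ("2.7.1", "2.8.0+cpu", "2.10.0"):
    action = "register custom op" if needs_custom_cpu_op(candidate) else "use native op"
    print(f"torch {candidate}: {action}")
```

Comparing integer tuples keeps the gate correct for future two-digit minor versions, which a plain string comparison would misorder.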

Signed-off-by: Andrea Fasoli <andrea.fasoli@ibm.com>

Remove custom scaled bmm op on cpu and fix fp8 test

eaad33b

Signed-off-by: Andrea Fasoli <andrea.fasoli@ibm.com>

andrea-fasoli requested review from BrandonGroth, chichun-charlie-liu, kcirred, nwang-ibm and tharapalanivel as code owners October 14, 2025 21:02

github-actions bot added the fix label Oct 14, 2025

chichun-charlie-liu reviewed Oct 15, 2025

fms_mo/aiu_addons/fp8/fp8_spyre_op.py (resolved)

andrea-fasoli added 2 commits October 15, 2025 17:39

Re-enable custom op for pt<=2.7

f61c561

Signed-off-by: Andrea Fasoli <andrea.fasoli@ibm.com>

Clean up versioning for int8 aiu op

1c4f22e

Signed-off-by: Andrea Fasoli <andrea.fasoli@ibm.com>

chichun-charlie-liu approved these changes Oct 17, 2025


chichun-charlie-liu merged commit 0f76a0d into main Oct 17, 2025
14 checks passed

chichun-charlie-liu deleted the scaled_mm_out branch October 17, 2025 20:40

chichun-charlie-liu mentioned this pull request Oct 17, 2025

chore(deps): Update torch requirement from <2.8,>=2.2.0 to >=2.2.0,<2.9 #177

Merged

andrea-fasoli mentioned this pull request Nov 11, 2025

fix: Fixes for paged fp8 attention with chunked prefill #191

Merged

8 tasks

chichun-charlie-liu merged 3 commits into main from scaled_mm_out
