
Fix Mamba2 step() D handling when D_has_hdim=True#893

Open
ipnon wants to merge 1 commit into state-spaces:main from ipnon:fix/mamba2-step-d-has-hdim


ipnon commented Apr 2, 2026

Summary

  • step() treats self.D as shape (nheads,) in both code paths, but when D_has_hdim=True its shape is (nheads * headdim,), so inference silently produces wrong outputs
  • forward() already handles this correctly via rearrange("(h p) -> h p", p=self.headdim) at line 191
  • Adds self.D_has_hdim conditional at both step() paths (lines 319 and 327) to match forward()
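
The shape handling described above can be sketched in NumPy. This is an illustration under assumptions, not the code in this PR: `d_skip` is a hypothetical helper, NumPy stands in for the torch/einops calls, and `reshape(-1, headdim)` plays the role of `rearrange("(h p) -> h p", p=self.headdim)`.

```python
import numpy as np

def d_skip(x, D, headdim, d_has_hdim):
    """Return the D * x skip term for x of shape (batch, nheads, headdim).

    Hypothetical helper for illustration; not part of mamba_ssm.
    """
    if d_has_hdim:
        # D is flat with nheads * headdim entries: unflatten to (nheads, headdim).
        D = D.reshape(-1, headdim)
    else:
        # D has one entry per head: add a trailing axis so it broadcasts over headdim.
        D = D[:, None]
    return D * x

rng = np.random.default_rng(0)
batch, nheads, headdim = 2, 3, 4
x = rng.standard_normal((batch, nheads, headdim))
D_head = rng.standard_normal(nheads)            # D_has_hdim=False layout
D_full = rng.standard_normal(nheads * headdim)  # D_has_hdim=True layout

skip_head = d_skip(x, D_head, headdim, d_has_hdim=False)
skip_full = d_skip(x, D_full, headdim, d_has_hdim=True)
```

In this sketch, element `(h, p)` of the skip term uses `D_full[h * headdim + p]` when `d_has_hdim` is true and `D_head[h]` otherwise, so a per-head D is the special case where every entry within a head is equal.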

Test plan

  • New parametrized test test_mamba2_step_forward_consistency checks forward/step output consistency for both D_has_hdim=True and D_has_hdim=False
  • Uses randomized (non-uniform) D values so the bug isn't masked by all-ones initialization
  • Both cases pass on T4 GPU (max diff ~5e-08)
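
The reason for randomizing D can be shown on a toy recurrence. The sketch below is not Mamba2 and not the test added in this PR: `forward`, `step`, `run_steps`, and the `aware` flag are hypothetical names, the scalar decay `a` is an assumed stand-in for the SSM dynamics, and the `aware=False` branch is an invented example of mishandling a flat D, not the actual bug.

```python
import numpy as np

def forward(x, D, a, headdim):
    """Toy full-sequence pass. x: (seqlen, nheads, headdim), D: (nheads * headdim,)."""
    t = np.arange(x.shape[0])
    # L[t, s] = a ** (t - s) for s <= t, zero above the diagonal (causal decay).
    L = np.tril(a ** (t[:, None] - t[None, :]))
    h = np.einsum("ts,shp->thp", L, x)
    return h + D.reshape(-1, headdim) * x

def step(x_t, h, D, a, headdim, aware):
    """Toy single-token update; returns (y_t, new_state)."""
    h = a * h + x_t
    if aware:
        D2 = D.reshape(-1, headdim)
    else:
        # Invented mishandling: treat the first nheads entries as a per-head D.
        D2 = D[: x_t.shape[0], None]
    return h + D2 * x_t, h

def run_steps(x, D, a, headdim, aware):
    h = np.zeros_like(x[0])
    ys = []
    for x_t in x:
        y, h = step(x_t, h, D, a, headdim, aware)
        ys.append(y)
    return np.stack(ys)

rng = np.random.default_rng(0)
seqlen, nheads, headdim, a = 5, 3, 4, 0.9
x = rng.standard_normal((seqlen, nheads, headdim))
D_ones = np.ones(nheads * headdim)
D_rand = rng.standard_normal(nheads * headdim)

consistent = np.allclose(forward(x, D_rand, a, headdim), run_steps(x, D_rand, a, headdim, True))
masked = np.allclose(forward(x, D_ones, a, headdim), run_steps(x, D_ones, a, headdim, False))
caught = not np.allclose(forward(x, D_rand, a, headdim), run_steps(x, D_rand, a, headdim, False))
```

With all-ones D the mishandled step still agrees with the full-sequence pass, so only the randomized D exposes the mismatch in this toy setting.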

Fixes #887, fixes #888.

When D_has_hdim=True, self.D has shape (nheads * headdim,) but step()
treated it as (nheads,) in both code paths, silently producing wrong
outputs during inference. forward() already handled this correctly
via rearrange("(h p) -> h p").

Add conditional reshape in both step() paths to match forward().
Add regression test comparing forward/step consistency.

Fixes state-spaces#887, fixes state-spaces#888.


Development

Successfully merging this pull request may close these issues.

  • Mamba2 step() works silently but when D_has_hdim=True and selective_state_update=None but is
  • Mamba2.step() handles D incorrectly when D_has_dim=True
