onnx: handle com.microsoft RotaryEmbedding contrib op#2284

Open
czoli1976 wants to merge 2 commits into sonos:main from czoli1976:feature/onnx-rope-microsoft

Conversation

@czoli1976
Contributor

Summary

Adds support for the com.microsoft.RotaryEmbedding contrib op (emitted by ONNX Runtime / GenAI / Olive LLM exports). It computes the same math as the standardized ai.onnx.RotaryEmbedding but takes its inputs in a different order: (input, position_ids, cos, sin) rather than the standardized (X, cos_cache, sin_cache, position_ids). Since tract resolves operators by name regardless of domain, the existing RotaryEmbedding handler is made domain-aware and remaps inputs accordingly. The contrib-only scale != 1.0 and is_packed_batching attributes are rejected with clear errors.
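
For readers skimming the description, the domain-aware remap can be pictured roughly as follows. This is a minimal sketch and not the code in this PR: `RopeInputs`, `remap_inputs`, and `check_contrib_attrs` are hypothetical names, inputs are represented as plain slot indices rather than tract outlets, and it assumes the contrib op always carries all four inputs and that `is_packed_batching` is rejected whenever it is set.

```rust
/// Slot index of each logical input in the node's input list.
/// Hypothetical type for illustration only; not part of tract.
#[derive(Debug, PartialEq)]
struct RopeInputs {
    x: usize,
    cos: usize,
    sin: usize,
    position_ids: Option<usize>,
}

/// Map input slots to logical inputs based on the operator domain.
/// ai.onnx order:       (X, cos_cache, sin_cache, position_ids?)
/// com.microsoft order: (input, position_ids, cos_cache, sin_cache)
fn remap_inputs(domain: &str, n_inputs: usize) -> Result<RopeInputs, String> {
    match domain {
        "com.microsoft" => {
            // Assumption: all four contrib inputs are present.
            if n_inputs != 4 {
                return Err(format!(
                    "com.microsoft RotaryEmbedding: expected 4 inputs, got {}",
                    n_inputs
                ));
            }
            Ok(RopeInputs { x: 0, position_ids: Some(1), cos: 2, sin: 3 })
        }
        // The default ONNX domain is spelled either "" or "ai.onnx".
        "" | "ai.onnx" => {
            if n_inputs != 3 && n_inputs != 4 {
                return Err(format!(
                    "ai.onnx RotaryEmbedding: expected 3 or 4 inputs, got {}",
                    n_inputs
                ));
            }
            let position_ids = if n_inputs == 4 { Some(3) } else { None };
            Ok(RopeInputs { x: 0, cos: 1, sin: 2, position_ids })
        }
        other => Err(format!("RotaryEmbedding: unsupported domain {:?}", other)),
    }
}

/// Reject the contrib-only attributes that this sketch does not model.
fn check_contrib_attrs(scale: f32, is_packed_batching: bool) -> Result<(), String> {
    if scale != 1.0 {
        return Err(format!("RotaryEmbedding: scale != 1.0 is not supported (got {})", scale));
    }
    if is_packed_batching {
        return Err("RotaryEmbedding: is_packed_batching is not supported".to_string());
    }
    Ok(())
}
```

Keeping the remap in one function means the rest of the lowering only sees logical inputs, whichever domain the node came from.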

Validation

  • Bit-exact vs onnxruntime on 3D (+num_heads), 4D, and interleaved models.
  • ai.onnx RotaryEmbedding conformance unchanged (32/32, both runtimes).
  • Empirically confirmed com.microsoft == ai.onnx math + input reorder (ORT vs ONNX ReferenceEvaluator, bit-exact for default and interleaved).
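
As a reference point for the "same math" claim above, the rotation both ops apply to a single head vector can be sketched as below. This follows the rotation as described in the ONNX operator documentation, not the lowering in this PR: `rope` is a hypothetical helper, and it ignores batching, the position_ids gather, and partial rotary dimensions, assuming `cos` and `sin` are already selected for the token's position.

```rust
/// Rotate one head vector `x` (even length d) by per-pair angles.
/// `cos` and `sin` each have length d / 2.
/// Non-interleaved pairs element i with element i + d/2;
/// interleaved pairs element 2i with element 2i + 1.
fn rope(x: &[f32], cos: &[f32], sin: &[f32], interleaved: bool) -> Vec<f32> {
    let half = x.len() / 2;
    assert_eq!(x.len(), 2 * half, "head size must be even");
    assert_eq!(cos.len(), half);
    assert_eq!(sin.len(), half);

    let mut out = vec![0.0f32; x.len()];
    for i in 0..half {
        // Pick the two coordinates that form the i-th rotated pair.
        let (a, b) = if interleaved { (2 * i, 2 * i + 1) } else { (i, i + half) };
        out[a] = x[a] * cos[i] - x[b] * sin[i];
        out[b] = x[a] * sin[i] + x[b] * cos[i];
    }
    out
}
```

With a 90-degree rotation (`cos = 0`, `sin = 1`) the two layouts visibly differ only in which coordinates are paired, which is why both need checking separately.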

Note — stacked on #2283

This branches off #2283 (the ai.onnx op handlers), which introduces rotary_embedding.rs. The only new commit here is the com.microsoft handler; the first commit is shared with #2283 and will drop out of this diff once #2283 merges. Happy to rebase onto main after #2283 lands.

czoli1976 and others added 2 commits May 26, 2026 09:16
…ion, RotaryEmbedding op handlers

Import handlers for four standardized ai.onnx operators, lowering to
existing tract primitives:
- LpNormalization (opset 1), MeanVarianceNormalization (opset 13)
- GroupNormalization (opset 18 & 21; opset-aware affine, f32 stash_type)
- RotaryEmbedding (opset 23; 3D/4D input, position_ids, partial + interleaved)

Corresponding ONNX backend node tests enabled in suite-onnx/node.txt.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
com.microsoft.RotaryEmbedding is identical math to the standardized
ai.onnx op but orders its inputs (input, position_ids, cos, sin). tract
resolves ops by name regardless of domain, so make the single handler
domain-aware and remap inputs accordingly. Rejects the contrib-only
scale != 1.0 and is_packed_batching attributes with clear errors.

Verified bit-exact against onnxruntime (3D, 4D, interleaved); ai.onnx
RotaryEmbedding conformance unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>