refactor(ir): unify memory space requirements into OpConversionRegistry #974
Hzfengsy wants to merge 2 commits into hw-native-sys:main
Conversation
Move per-input memory space declarations into OpConversionRegistry via InputSpaceReq, replacing the special-purpose MatmulSlicePatternCollector with a general ConsumerSpaceCollector driven by registered metadata. The framework now auto-bridges TensorType args to the required memory space before calling converters, eliminating LoadOperandToMat and simplifying matmul/matmul_acc converters to pure compute-op emitters.
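As context, the registry shape this describes might look roughly like the following minimal sketch. The names `InputSpaceReq`, `ConversionEntry`, `OpConversionRegistry`, and the memory-space values come from this PR, but the member layout, converter signature, and string-keyed lookup are illustrative assumptions, not the actual pypto code.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Assumed set of memory spaces; the real enum lives in the pypto IR.
enum class MemorySpace { Vec, Mat, Left, Right, Acc, Bias };

// Per-input requirement: which memory space the converter expects the
// argument to live in before it is invoked.
struct InputSpaceReq {
  MemorySpace space;
  bool transpose = false;  // e.g. a matmul operand that must be transposed
};

// Toy converter signature: real converters consume and produce IR nodes.
using ConverterFn = std::function<std::string(const std::vector<std::string>&)>;

// A registered conversion: the converter plus its per-input metadata.
struct ConversionEntry {
  ConverterFn func;
  std::vector<InputSpaceReq> input_reqs;  // indexed by argument position
};

class OpConversionRegistry {
 public:
  void Register(const std::string& op, ConverterFn fn,
                std::vector<InputSpaceReq> reqs) {
    entries_[op] = ConversionEntry{std::move(fn), std::move(reqs)};
  }
  // Returns nullptr when the op has no registered conversion.
  const ConversionEntry* Lookup(const std::string& op) const {
    auto it = entries_.find(op);
    return it == entries_.end() ? nullptr : &it->second;
  }

 private:
  std::map<std::string, ConversionEntry> entries_;
};
```

With this shape, both the collector and the mutator can ask one source of truth for an op's input requirements instead of hardcoding per-op knowledge.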
📝 Walkthrough
Extended the op conversion registry with per-input memory-space requirements (InputSpaceReq, ConversionEntry) and updated the tensor→tile conversion pass to pre-scan consumer space needs and automatically bridge argument memory spaces via synthetic tile.load/tile.move before invoking converters.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Pass as Conversion Pass
    participant Registry as OpConversionRegistry
    participant Collector as ConsumerSpaceCollector
    participant Mutator as TensorToTileMutator
    participant Bridge as BridgeInputSpaces
    participant Converter as Op Converter
    Pass->>Collector: Pre-scan IR for consumer needs
    Collector->>Registry: Lookup ConversionEntry (func + input_reqs)
    Registry-->>Collector: Return input_reqs per-op
    Collector-->>Pass: Return var→(MemorySpace, transpose) map
    Pass->>Mutator: Start mutating function (with consumer map)
    Note over Mutator: For each converted call
    Mutator->>Bridge: BridgeInputSpaces(call args, input_reqs)
    Bridge->>Bridge: For each arg: check actual vs required space
    Bridge->>Bridge: Emit synthetic tile.load/tile.move as needed
    Bridge-->>Mutator: Return bridged args + bridge prologue
    Mutator->>Converter: Invoke converter func with bridged args
    Converter->>Converter: Emit tile.* compute call (no loads)
    Converter-->>Mutator: Return result + converter prologue
    Mutator->>Mutator: Merge bridge prologue + converter prologue + result
```
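The bridging step in the diagram can be sketched in simplified form. This is a toy model: the real BridgeInputSpaces operates on IR expressions and chooses between `tile.load` and `tile.move` based on the actual and required spaces, whereas this sketch works on named arguments and emits a single synthetic move string per mismatch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Assumed memory-space enum; the real one lives in the pypto IR.
enum class MemorySpace { Vec, Mat, Left, Right, Acc, Bias };

// Toy stand-in for an IR argument: a name plus its current memory space.
struct Arg {
  std::string name;
  MemorySpace space;
};

struct BridgeResult {
  std::vector<Arg> args;              // arguments after bridging
  std::vector<std::string> prologue;  // synthetic load/move ops to prepend
};

// For each argument whose actual space differs from the declared
// requirement, emit a synthetic move into a fresh temporary and pass
// that temporary to the converter instead of the original argument.
BridgeResult BridgeInputSpaces(const std::vector<Arg>& args,
                               const std::vector<MemorySpace>& reqs) {
  BridgeResult out;
  for (std::size_t i = 0; i < args.size(); ++i) {
    if (i < reqs.size() && args[i].space != reqs[i]) {
      Arg bridged{args[i].name + ".bridged", reqs[i]};
      out.prologue.push_back("tile.move " + bridged.name + " <- " +
                             args[i].name);
      out.args.push_back(bridged);
    } else {
      out.args.push_back(args[i]);  // already in the required space
    }
  }
  return out;
}
```

Because the bridge returns a prologue rather than mutating in place, the mutator can merge it with the converter's own prologue, matching the final merge step in the diagram.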
Pull request overview
This PR refactors convert_tensor_to_tile_ops to centralize per-input memory space requirements in OpConversionRegistry, enabling framework-driven memory space bridging and replacing the special-cased tensor.slice → tensor.matmul look-ahead with a metadata-driven collector.
Changes:
- Extend `OpConversionRegistry` to store converter functions alongside per-input `InputSpaceReq` metadata.
- Replace `MatmulSlicePatternCollector` with a general `ConsumerSpaceCollector` that reads input space requirements from the registry.
- Add call-site auto-bridging in `TensorToTileMutator` (currently implemented as `tile.load` insertion for `TensorType` args) and simplify matmul converters to pure compute emitters.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `include/pypto/ir/transforms/op_conversion_registry.h` | Introduces `InputSpaceReq`/`ConversionEntry` and updates registration/lookup APIs. |
| `src/ir/transforms/op_conversion_registry.cpp` | Migrates registry storage to `ConversionEntry` and registers matmul input space requirements. |
| `src/ir/transforms/convert_tensor_to_tile_ops_pass.cpp` | Adds consumer-space pre-scan, wires registry entries into conversion, and implements input bridging plus consumer-driven slice load overrides. |
Code Review
This pull request generalizes the mechanism for handling memory space requirements during tensor-to-tile conversion. It replaces the specialized MatmulSlicePatternCollector with a metadata-driven ConsumerSpaceCollector that utilizes InputSpaceReq definitions in the OpConversionRegistry. Key additions include the BridgeInputSpaces utility for automatic tile.load insertion and updates to tensor.matmul and tensor.matmul_acc to leverage this framework. Review feedback highlights a potential issue where skipping operations with any input requirements might lead to type mismatches for arguments without requirements, such as accumulators. Additionally, it is suggested to refine the 'first consumer wins' strategy to better handle conflicting memory space requirements by prioritizing specialized spaces over the default.
- Add GlobalVar guard in ConsumerSpaceCollector for consistency with TensorArgsInConvertedOpsCollector and TensorToTileMutator.
- Prioritize non-Vec memory spaces in ConsumerSpaceCollector so a Vec requirement never shadows a later Mat/Left/Right/Acc/Bias requirement.
- Update stale comment above tensor.matmul / tensor.matmul_acc converters to reflect the new framework auto-bridging responsibility split.
- Phase-1 collector now excludes args per-index (not whole call) so matmul_acc's acc arg still receives a Phase-1 load when needed.
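The non-Vec prioritization described above could be implemented along these lines. This is a hypothetical sketch: the function name `RecordRequirement` and the map-based collector state are assumptions, but the merge rule follows the stated behavior (a specialized space overrides Vec, and the first specialized requirement seen wins).

```cpp
#include <cassert>
#include <map>
#include <string>

// Assumed memory-space enum; the real one lives in the pypto IR.
enum class MemorySpace { Vec, Mat, Left, Right, Acc, Bias };

// Merge one consumer's requirement into the collector's var→space map:
// - first requirement seen for a var is recorded as-is;
// - a specialized space (Mat/Left/Right/Acc/Bias) replaces a default Vec;
// - among specialized spaces, the first one seen wins.
void RecordRequirement(std::map<std::string, MemorySpace>& needs,
                       const std::string& var, MemorySpace req) {
  auto it = needs.find(var);
  if (it == needs.end()) {
    needs.emplace(var, req);  // first consumer of this var
  } else if (it->second == MemorySpace::Vec && req != MemorySpace::Vec) {
    it->second = req;  // specialized space shadows the Vec default
  }
  // otherwise keep the existing (specialized) requirement
}
```

This keeps the simple "first consumer wins" behavior while preventing a Vec consumer that happens to be scanned first from hiding a later matmul-style requirement.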
Summary

- Add `InputSpaceReq` and `ConversionEntry` to `OpConversionRegistry`, allowing converters to declare per-input memory space requirements as metadata
- Replace `MatmulSlicePatternCollector` with a general `ConsumerSpaceCollector` driven by registered converter metadata
- Add auto-bridging in `TensorToTileMutator` that automatically loads `TensorType` args to the required memory space before calling converters
- Remove the `LoadOperandToMat` helper (no longer needed)

Motivation

Memory space handling was spread across three independent mechanisms (Phase-1 entry loads, `MatmulSlicePatternCollector`, `LoadOperandToMat`) with a hardcoded `kSelfLoadingOps` exclusion list. Adding any new op that needs non-Vec inputs would require a new special-case collector. This refactoring unifies the approach: converters declare what they need, and the framework handles the rest.

Closes #972
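The replacement for the hardcoded kSelfLoadingOps list can be sketched as a per-argument-index decision. The optional-per-index encoding and the function name `NeedsPhase1Load` here are assumptions for illustration; the point is that the default entry load is skipped only for positions with a declared requirement (which the bridge loads later), not for the whole call.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Assumed memory-space enum; the real one lives in the pypto IR.
enum class MemorySpace { Vec, Mat, Left, Right, Acc, Bias };

// Decide whether argument arg_index still needs the default Phase-1 entry
// load. Positions without a declared requirement (e.g. an accumulator-style
// arg in this sketch) keep the default load; positions with a requirement
// are left to the auto-bridge.
bool NeedsPhase1Load(const std::vector<std::optional<MemorySpace>>& reqs,
                     std::size_t arg_index) {
  return arg_index >= reqs.size() || !reqs[arg_index].has_value();
}
```

Compared to a name-based exclusion set, this scales to new ops automatically: registering requirements is the only step needed.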
Testing

- `test_convert_tensor_to_tile_ops` tests pass (including slice→matmul patterns)