feat(ir,dsl): add pl.runtime_print for runtime tile/tensor debugging by Hzfengsy · Pull Request #857 · hw-native-sys/pypto

Hzfengsy · 2026-04-02T10:56:19Z

Summary

- Add pl.runtime_print(tile_or_tensor) DSL function that lowers to pto.tprint, enabling runtime debugging of tile and tensor contents on device
- Support both tiles (pl.runtime_print(tile) / pl.tile.runtime_print(tile)) and tensors (pl.runtime_print(tensor) / pl.tensor.runtime_print(tensor)) via unified dispatch
- Register tile.runtime_print and tensor.runtime_print C++ IR ops with pass-through type deduction
- Register tensor-to-tile conversion so tensor.runtime_print lowers correctly in InCore scope
- Add codegen mapping to pto.tprint for both ops
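
To illustrate the unified dispatch mentioned above, here is a minimal, self-contained sketch. It does not import pypto: the `Tensor` and `Tile` classes and the three functions are hypothetical stand-ins, and they return a string naming the IR op only so the routing is visible (the walkthrough describes the real DSL helpers as returning `None`).

```python
from dataclasses import dataclass


@dataclass
class Tensor:
    """Hypothetical stand-in for a DSL tensor handle (not the pypto class)."""
    name: str


@dataclass
class Tile:
    """Hypothetical stand-in for a DSL tile handle (not the pypto class)."""
    name: str


def tensor_runtime_print(tensor: Tensor) -> str:
    # Stand-in for pl.tensor.runtime_print: name the IR op that would be emitted.
    return f"tensor.runtime_print({tensor.name})"


def tile_runtime_print(tile: Tile) -> str:
    # Stand-in for pl.tile.runtime_print.
    return f"tile.runtime_print({tile.name})"


def runtime_print(src):
    """Unified entry point: choose the type-specific helper from the argument type."""
    if isinstance(src, Tensor):
        return tensor_runtime_print(src)
    if isinstance(src, Tile):
        return tile_runtime_print(src)
    # Scalars and other values are rejected instead of being silently ignored.
    raise TypeError(
        f"runtime_print expects a Tensor or Tile, got {type(src).__name__}"
    )
```

Under these assumptions, `runtime_print(Tile("acc"))` routes to the tile helper and `runtime_print(1.0)` raises, which mirrors the scalar error case listed in the test plan.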

Test plan

- 10 unit tests: parser, roundtrip, type preservation, namespace access, error cases
- 2 system tests: tile print and tensor print with PTOTestCase harness
- Full test suite: 3316 passed, 0 failed
- clang-tidy: clean
- pyright: clean

Closes #846

…w-native-sys#846)

Add runtime_print DSL function that lowers to pto.tprint, enabling users to print tile and tensor contents for on-device debugging.

- Register tile.runtime_print and tensor.runtime_print C++ IR ops
- Add Python IR, DSL, and unified dispatch layers
- Register tensor-to-tile conversion for InCore scope lowering
- Add codegen mapping to pto.tprint for both ops
- Add unit tests (10) and system tests (2)

Closes hw-native-sys#846

coderabbitai · 2026-04-02T10:56:35Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Walkthrough

The PR adds a runtime-print debugging feature across the pipeline: new IR ops tile.runtime_print and tensor.runtime_print (C++), Python IR wrappers and DSL helpers (pl.runtime_print), conversion lowering from tensor→tile, backend mapping to the existing print codegen, build updates, and unit/system tests.

Changes

Cohort / File(s)	Summary
Build Configuration `CMakeLists.txt`	Added `src/ir/op/tile_ops/utility.cpp` and `src/ir/op/tensor_ops/utility.cpp` to `PYTO_SOURCES`.
IR Op Implementations `src/ir/op/tile_ops/utility.cpp`, `src/ir/op/tensor_ops/utility.cpp`	New IR ops `tile.runtime_print` and `tensor.runtime_print` with type-deduction helpers that validate single-arg and return the input Tile/Tensor type (pass-through).
IR → Tile Conversion `src/ir/transforms/op_conversion_registry.cpp`	Added converter for `tensor.runtime_print`: if input is TileType, call `tile.runtime_print`; if TensorType, emit a `tile.load` prologue then `tile.runtime_print`.
Backend Mapping `src/backend/common/pto_ops_common.cpp`	Replaced former `tile.print` registration with registrations mapping `tile.runtime_print` and `tensor.runtime_print` to the shared print codegen factory (`pto.tprint`).
Python IR Wrappers `python/pypto/ir/op/tile_ops.py`, `python/pypto/ir/op/tensor_ops.py`	Added `runtime_print(expr, span=None) -> Call` helpers that emit `tile.runtime_print` / `tensor.runtime_print` IR calls.
DSL (Type-Specific) `python/pypto/language/op/tile_ops.py`, `python/pypto/language/op/tensor_ops.py`	Added `runtime_print(tile/tensor) -> None` DSL statement helpers (unwrap and forward to IR ops); updated `__all__`.
DSL (Unified) & Public API `python/pypto/language/op/unified_ops.py`, `python/pypto/language/__init__.py`	Added unified `runtime_print(src: Tensor
Unit Tests `tests/ut/language/parser/test_runtime_print.py`	New unit tests verifying IR emission, call operator names/types, statement insertion, round-trip source parsing, and error on scalar inputs.
System Tests `tests/st/runtime/test_runtime_print.py`	End-to-end tests for tile and tensor usage ensuring runtime_print side-effects do not change numeric results on 128×128 FP32 inputs.

Sequence Diagram

sequenceDiagram
    participant DSL as DSL User\n(pl.runtime_print)
    participant PyIR as Python IR Ops\n(_ir_ops.runtime_print)
    participant IRReg as IR Registration\n(REGISTER_OP)
    participant TypeConv as Type Conversion\n(OpConversionRegistry)
    participant Backend as Backend\n(pto_ops_common)
    participant PTO as PTO\n(pto.tprint)

    DSL->>PyIR: call runtime_print(src)
    PyIR->>IRReg: create Call to\ntensor.runtime_print or tile.runtime_print
    IRReg->>IRReg: deduce type:\nvalidate arg type, return pass-through
    TypeConv->>TypeConv: lower tensor.runtime_print\n→ tile.runtime_print (may insert tile.load)
    Backend->>PTO: map tile.runtime_print\n→ emit pto.tprint

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Possibly Related PRs

feat(ir): Add tensor-to-block op conversions with broadcast and matmul support #387: Modifies the OpConversionRegistry in ways closely related to this PR's tensor.runtime_print → tile.runtime_print conversion registration.

Poem

🐰 I hopped through code to add a print delight,

Tiles and tensors now speak in the night,
From DSL to PTO they follow the trail,
A rabbit’s debug hop — no detail will fail!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 51.02% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title accurately describes the main change: adding a runtime_print function for debugging tiles/tensors via pl.runtime_print.
Description check	✅ Passed	The description provides relevant context about the feature addition, implementation approach, and testing, all related to the changeset.
Linked Issues check	✅ Passed	The PR fully implements the requirements from issue `#846`: adds a DSL-level print helper (pl.runtime_print instead of pl.print) that lowers to pto.tprint for tile/tensor debugging.
Out of Scope Changes check	✅ Passed	All changes directly support the objective of adding pl.runtime_print for runtime debugging; no unrelated modifications are present.

gemini-code-assist

Code Review

This pull request introduces a runtime_print utility for both tensors and tiles to facilitate debugging by emitting pto.tprint instructions. The implementation spans the C++ IR, Python bindings, and unified language operators, including support for type deduction and IR conversion. Comprehensive unit and runtime tests were added to verify the new functionality. Feedback was provided to improve test precision by catching a specific TypeError instead of a generic Exception in the unit tests.

tests/ut/language/parser/test_runtime_print.py

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

src/ir/op/tensor_ops/utility.cpp (1)
44-51: Consider adding .no_memory_spec() for consistency with tile.runtime_print.

The tile.runtime_print registration in src/ir/op/tile_ops/utility.cpp (line 48) includes .no_memory_spec(), but this tensor counterpart omits it. Since tensor.runtime_print is similarly a pure side-effect debugging operation with no memory specification requirements, adding it would maintain consistency.
♻️ Proposed fix
 REGISTER_OP("tensor.runtime_print")
     .set_op_category("TensorOp")
     .set_description("Print tensor contents for debugging (generates pto.tprint)")
     .add_argument("tensor", "Input tensor to print (TensorType)")
+    .no_memory_spec()
     .f_deduce_type([](const std::vector<ExprPtr>& args,
                       const std::vector<std::pair<std::string, std::any>>& kwargs) {
       return DeduceTensorPrintType(args, kwargs, "tensor.runtime_print");
     });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/ir/op/tensor_ops/utility.cpp` around lines 44 - 51, The
tensor.runtime_print op registration is missing .no_memory_spec(), making it
inconsistent with tile.runtime_print; update the
REGISTER_OP("tensor.runtime_print") chain to include .no_memory_spec()
(alongside set_op_category, set_description, add_argument, and f_deduce_type) so
the debug-only op declares no memory specification requirement—keep
DeduceTensorPrintType(...) and the existing f_deduce_type call unchanged.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/ut/language/parser/test_runtime_print.py`:
- Around line 185-193: Update the test
test_runtime_print_requires_tile_or_tensor to assert the specific exception
TypeError rather than a bare Exception: replace pytest.raises(Exception) with
pytest.raises(TypeError) so the test verifies that pl.runtime_print(x) (in the
function defined inside the test) raises TypeError for non-Tensor/Tile inputs;
keep the same test body and references to pl.runtime_print and the inner
function to locate the change.
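
The point about asserting a specific exception can be seen in a small sketch. It uses the standard library's `unittest.TestCase.assertRaises` in place of `pytest.raises` so that it runs without extra dependencies; both let a non-matching exception propagate. The two `runtime_print` functions are hypothetical stand-ins, not the PR's code.

```python
import unittest


def runtime_print(src):
    """Hypothetical stand-in: reject anything that is not a list."""
    if not isinstance(src, list):
        raise TypeError("runtime_print expects a tile or tensor")


def buggy_runtime_print(src):
    """Same intent, but a typo raises NameError before any type check runs."""
    return undefined_helper(src)  # noqa: F821 (deliberate bug for the demo)


case = unittest.TestCase()

# A broad check also passes for the buggy version, hiding the regression.
with case.assertRaises(Exception):
    buggy_runtime_print(1.0)

# A specific check passes for the intended failure mode.
with case.assertRaises(TypeError):
    runtime_print(1.0)

# The specific check rejects the buggy version: NameError is not a TypeError,
# so it propagates out of the context manager.
try:
    with case.assertRaises(TypeError):
        buggy_runtime_print(1.0)
except NameError:
    specific_check_caught_bug = True
else:
    specific_check_caught_bug = False
```

The broad assertion is satisfied by any failure at all, so it cannot distinguish the intended rejection from an unrelated crash.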

📥 Commits

Reviewing files that changed from the base of the PR and between d765fc0 and 98afb47.

📒 Files selected for processing (13)

CMakeLists.txt
python/pypto/ir/op/tensor_ops.py
python/pypto/ir/op/tile_ops.py
python/pypto/language/__init__.py
python/pypto/language/op/tensor_ops.py
python/pypto/language/op/tile_ops.py
python/pypto/language/op/unified_ops.py
src/backend/common/pto_ops_common.cpp
src/ir/op/tensor_ops/utility.cpp
src/ir/op/tile_ops/utility.cpp
src/ir/transforms/op_conversion_registry.cpp
tests/st/runtime/test_runtime_print.py
tests/ut/language/parser/test_runtime_print.py

tests/ut/language/parser/test_runtime_print.py

Copilot

Pull request overview

Adds a new debugging utility pl.runtime_print(tile_or_tensor) to the PyPTO DSL that lowers to pto.tprint, enabling runtime printing of tile/tensor contents without affecting program results.

Changes:

Introduce new IR ops tile.runtime_print and tensor.runtime_print with pass-through type deduction.
Add DSL APIs for unified dispatch (pl.runtime_print) plus explicit namespaces (pl.tile.runtime_print, pl.tensor.runtime_print).
Add backend codegen mappings to emit pto.tprint, plus new unit + system tests.
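
As a rough illustration of "pass-through type deduction", the sketch below validates that exactly one argument of the expected kind is given and returns that argument's type unchanged. The `TileType` and `TensorType` dataclasses and the `deduce_print_type` helper are hypothetical Python stand-ins for the C++ logic, which is not shown on this page.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class TileType:
    """Hypothetical stand-in for the IR tile type."""
    shape: Tuple[int, ...]
    dtype: str


@dataclass(frozen=True)
class TensorType:
    """Hypothetical stand-in for the IR tensor type."""
    shape: Tuple[int, ...]
    dtype: str


def deduce_print_type(arg_types: Sequence[object], expected: type, op_name: str):
    """Check for a single argument of the expected kind and return its type as-is."""
    if len(arg_types) != 1:
        raise ValueError(
            f"{op_name} expects exactly 1 argument, got {len(arg_types)}"
        )
    (arg_type,) = arg_types
    if not isinstance(arg_type, expected):
        raise TypeError(
            f"{op_name} expects {expected.__name__}, got {type(arg_type).__name__}"
        )
    # Pass-through: the op's result type is exactly the input type.
    return arg_type
```

Returning the input type unchanged is what would let a print call sit in the middle of a computation without altering the types seen by later ops.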

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

File	Description
tests/ut/language/parser/test_runtime_print.py	Parser/unit coverage for IR shape, printing roundtrip, and basic error handling.
tests/st/runtime/test_runtime_print.py	End-to-end runtime coverage to ensure `pto.tprint` emission doesn’t change results.
src/ir/transforms/op_conversion_registry.cpp	Adds tensor→tile op conversion entry for `tensor.runtime_print`.
src/ir/op/tile_ops/utility.cpp	Registers `tile.runtime_print` IR op and type deduction.
src/ir/op/tensor_ops/utility.cpp	Registers `tensor.runtime_print` IR op and type deduction.
src/backend/common/pto_ops_common.cpp	Maps the new ops to `pto.tprint` codegen.
python/pypto/language/op/unified_ops.py	Adds unified `pl.runtime_print` dispatch (Tensor vs Tile).
python/pypto/language/op/tile_ops.py	Adds `pl.tile.runtime_print`.
python/pypto/language/op/tensor_ops.py	Adds `pl.tensor.runtime_print`.
python/pypto/language/__init__.py	Re-exports `runtime_print` at `pypto.language` top level.
python/pypto/ir/op/tile_ops.py	Adds IR builder helper for `tile.runtime_print`.
python/pypto/ir/op/tensor_ops.py	Adds IR builder helper for `tensor.runtime_print`.
CMakeLists.txt	Includes the new C++ op source files in the build.

src/ir/transforms/op_conversion_registry.cpp

tests/ut/language/parser/test_runtime_print.py

- Replace RegisterSimple with RegisterCustom for the tensor.runtime_print conversion: insert a tile.load prologue when the argument is still a TensorType (e.g. printing a function parameter before any explicit tile.load), matching the tensor.fillpad pattern
- Use InvalidOperationError instead of bare Exception in the error-case test for precision
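
The conversion rule described here can be sketched as a small Python model. `Value`, `Call`, and `convert_tensor_runtime_print` are hypothetical names for illustration only; the real converter is C++ in `op_conversion_registry.cpp`, and the real `tile.load` presumably takes more operands than this sketch shows.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Value:
    """Hypothetical IR value: a name plus whether it is a tensor or a tile."""
    name: str
    kind: str  # "tensor" or "tile"


@dataclass
class Call:
    """Hypothetical IR call: an op name and its arguments."""
    op: str
    args: List[Value]


def convert_tensor_runtime_print(arg: Value) -> Tuple[List[Call], Call]:
    """Return (prologue, call) for lowering tensor.runtime_print."""
    if arg.kind == "tile":
        # Argument was already converted to a tile: forward it directly.
        return [], Call("tile.runtime_print", [arg])
    if arg.kind == "tensor":
        # Argument is still a tensor (e.g. a function parameter): load it first.
        loaded = Value(f"{arg.name}_tile", "tile")
        prologue = [Call("tile.load", [arg])]
        return prologue, Call("tile.runtime_print", [loaded])
    raise TypeError(f"unsupported argument kind: {arg.kind}")
```

In this model, printing a raw function parameter yields one `tile.load` followed by `tile.runtime_print`, while printing an already-loaded tile yields no prologue.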

MakePrintCodegenPTO was emitting a placeholder type with wrong separator ('|' instead of ':') and a dummy type string instead of the actual tile buffer type from GetExprTypeAnnotation.

pto-isa guards TPRINT behind #ifdef _DEBUG. When ptoas-generated code contains TPRINT (from pto.tprint / runtime_print), insert #define _DEBUG before the pto-inst.hpp include so the macro is available.

_DEBUG enables cce::printf calls across all pto-isa headers, which don't compile in simulation. Instead, inject a no-op TPRINT template after the include, guarded by #ifndef _DEBUG so the real implementation is used on hardware.
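
A minimal sketch of this kind of header patching is shown below, written in Python for illustration. Everything in it is an assumption: the function name, the detection by substring, and especially the text of the no-op stub, which is a guess at what a fallback could look like and is not taken from the PR or from pto-isa.

```python
# Hypothetical C++ stub text. The real injected code is not shown on this page.
NOOP_TPRINT_STUB = (
    "#ifndef _DEBUG\n"
    "// Assumed no-op fallback so TPRINT calls compile without _DEBUG.\n"
    "template <typename... Args>\n"
    "inline void TPRINT(Args&&...) {}\n"
    "#endif\n"
)


def inject_tprint_stub(source: str, include_name: str = "pto-inst.hpp") -> str:
    """Insert the stub right after the first include of include_name.

    Sources that never mention TPRINT are returned unchanged.
    """
    if "TPRINT" not in source:
        return source
    out = []
    injected = False
    for line in source.splitlines(keepends=True):
        out.append(line)
        is_target = line.lstrip().startswith("#include") and include_name in line
        if is_target and not injected:
            if not line.endswith("\n"):
                out.append("\n")
            out.append(NOOP_TPRINT_STUB)
            injected = True
    return "".join(out)
```

Placing the stub after the include and guarding it with `#ifndef _DEBUG` matches the stated intent: builds that define `_DEBUG` keep the real implementation, and other builds get a definition that compiles to nothing.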

Copilot AI review requested due to automatic review settings April 2, 2026 10:56

Copilot started reviewing on behalf of Hzfengsy April 2, 2026 10:56

gemini-code-assist bot reviewed Apr 2, 2026

tests/ut/language/parser/test_runtime_print.py (outdated)

coderabbitai bot reviewed Apr 2, 2026

tests/ut/language/parser/test_runtime_print.py

Copilot AI reviewed Apr 2, 2026

src/ir/transforms/op_conversion_registry.cpp (outdated)

tests/ut/language/parser/test_runtime_print.py

Hzfengsy added 4 commits April 3, 2026 16:23

fix(codegen): use proper type annotation in pto.tprint codegen

ab24de6

fix(codegen): define _DEBUG in kernel header when TPRINT is used

b1a954c

Hzfengsy marked this pull request as draft April 6, 2026 06:54

Hzfengsy wants to merge 5 commits into hw-native-sys:main from Hzfengsy:feat/runtime-print

2 participants
