TStore verifier should reject tile/partition shape mismatch and subset should not allow enlarging tile shape #322

@Zhendong404

Summary

pto.tstore currently accepts IR where the destination partition shape is larger than the source tile valid shape.

This later lowers to PTO-ISA C++ like:

GlobalTensor<float, ..., Shape<..., 64, 64>, ...> dst = ...;
Tile<..., 64, 64, ..., 32, 32, ...> tile = ...;
TSTORE(dst, tile);

PTO-ISA has an implicit constraint on TSTORE: the GlobalTensor shape must match the tile's valid shape. In the snippet above the destination is 64x64 while the tile's valid shape is 32x32. The current verifier does not enforce this constraint, so invalid IR passes verification and the mismatch only surfaces much later, in the generated code.

Reproducer

A generated case from this repo already demonstrates the problem:

  • source test case:
    test/pto_isa_st/TMaxs/tmaxs_float_64x64_32x32_32x32.py
  • generated PTO IR:
    build/output/TMaxs/tmaxs_float_64x64_32x32_32x32-pto-ir.pto
  • generated kernel:
    build/output_npu_validation/TMaxs/tmaxs_float_64x64_32x32_32x32/tmaxs_float_64x64_32x32_32x32_kernel.cpp

Relevant PTO IR:

%4 = pto.alloc_tile : !pto.tile_buf<loc=vec, dtype=f32, rows=32, cols=32, v_row=32, v_col=32, ...>
%5 = pto.alloc_tile : !pto.tile_buf<loc=vec, dtype=f32, rows=64, cols=64, v_row=32, v_col=32, ...>
%7 = pto.subset %4[%c0_5, %c0_5] sizes [64, 64] : !pto.tile_buf<loc=vec, dtype=f32, rows=32, cols=32, v_row=32, v_col=32, ...>
pto.tmaxs ins(%7, %6 : !pto.tile_buf<loc=vec, dtype=f32, rows=64, cols=64, v_row=32, v_col=32, ...>, f32)
  outs(%5 : !pto.tile_buf<loc=vec, dtype=f32, rows=64, cols=64, v_row=32, v_col=32, ...>)
pto.tstore ins(%5 : !pto.tile_buf<loc=vec, dtype=f32, rows=64, cols=64, v_row=32, v_col=32, ...>)
  outs(%3 : !pto.partition_tensor_view<64x64xf32>)

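This IR passes verification today, but the verifier should reject it: %7 enlarges the 32x32 tile %4 to 64x64 via pto.subset, and pto.tstore writes a tile whose valid shape is 32x32 into a 64x64 partition.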

Root cause

There seem to be two gaps:

  1. TStoreOp::verify() checks element types / address spaces, but does not check that destination partition shape matches the source tile valid shape.
  2. SubsetOp::verify() does not reject non-boxed subsets that enlarge the tile shape, so a 32x32 tile can become a subset ... sizes [64, 64].

Because of that, invalid IR is accepted and codegen simply materializes the mismatch.

Expected behavior

At least one of the following should be enforced:

  1. pto.tstore verifier should reject cases where:
    • dst partition rank/shape does not match src tile valid shape
    • especially for static cases, dst shape != src valid_shape
  2. pto.subset verifier should reject shape enlargement:
    • subset result sizes must not exceed source tile shape
    • ideally also remain consistent with valid-shape semantics
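
The static-shape part of check 1 could be sketched as below. This is a standalone illustration, not the actual TStoreOp::verify() code: the function name checkTStoreShapes, the kDynamicDim sentinel, and the use of plain std::vector<int64_t> shapes are all assumptions made for the example. It also assumes the partition shape and the tile valid shape are compared at the same rank; the real verifier may need to handle partitions with additional leading dimensions.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical sentinel for a dimension that is not known at verification time.
constexpr int64_t kDynamicDim = -1;

// Returns an error message if the destination partition shape is statically
// known to differ from the source tile valid shape, std::nullopt otherwise.
// Dynamic dimensions are skipped because they cannot be checked statically.
std::optional<std::string>
checkTStoreShapes(const std::vector<int64_t> &dstShape,
                  const std::vector<int64_t> &srcValidShape) {
  if (dstShape.size() != srcValidShape.size())
    return std::string("dst partition rank does not match src tile valid-shape rank");

  for (std::size_t i = 0; i < dstShape.size(); ++i) {
    if (dstShape[i] == kDynamicDim || srcValidShape[i] == kDynamicDim)
      continue;
    if (dstShape[i] != srcValidShape[i])
      return "dst partition dim " + std::to_string(i) + " is " +
             std::to_string(dstShape[i]) + " but src tile valid dim is " +
             std::to_string(srcValidShape[i]);
  }
  return std::nullopt;
}
```

For the reproducer above, checkTStoreShapes({64, 64}, {32, 32}) returns an error message, while a 32x32 partition paired with a 32x32 valid shape is accepted.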

Why this matters

Without this check, invalid test cases or frontend bugs can silently generate PTO IR that looks structurally valid but violates PTO-ISA constraints at TSTORE lowering time.

Suggested fix

  • Add a shape compatibility check in TStoreOp::verify()
  • Add a no-enlargement check in SubsetOp::verify()
  • Optionally add a regression test using the existing tmaxs_float_64x64_32x32_32x32 pattern
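
A minimal sketch of the no-enlargement check, assuming the subset offsets and sizes are statically known. The helper name subsetFitsInTile and the plain-vector representation are hypothetical and only illustrate the bounds logic; the actual SubsetOp::verify() would have to read these values from the op's operands and attributes, and may need to skip dynamic offsets.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true if the region [offset, offset + size) lies inside the source
// tile in every dimension. A subset that enlarges the tile fails this check.
bool subsetFitsInTile(const std::vector<int64_t> &tileShape,
                      const std::vector<int64_t> &offsets,
                      const std::vector<int64_t> &sizes) {
  if (offsets.size() != tileShape.size() || sizes.size() != tileShape.size())
    return false;

  for (std::size_t i = 0; i < tileShape.size(); ++i) {
    if (offsets[i] < 0 || sizes[i] <= 0)
      return false;
    if (offsets[i] + sizes[i] > tileShape[i])
      return false;
  }
  return true;
}
```

With the shapes from the reproducer, subsetFitsInTile({32, 32}, {0, 0}, {64, 64}) is false, so the pto.subset that turns a 32x32 tile into sizes [64, 64] would be rejected, while a 32x32 subset of a 64x64 tile is accepted.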

Local reference points

  • lib/PTO/IR/PTO.cpp:
    • TStoreOp::verify()
    • SubsetOp::verify()
  • test/sample_utils/pto_isa_st_cases.py:
    • _subset_if_needed() currently builds pto.subset(... sizes=[dst_shape]) whenever src_shape != dst_shape, even when dst_shape > src_shape

Status: In Progress
