
fix: preserve compatibility with old-style A2A3 TPipe local slot arg #68

Open
ndleslx wants to merge 1 commit into hw-native-sys:main from ndleslx:fix/a2a3-tpipe-compat

ndleslx commented Apr 12, 2026

Summary

  • make include/pto/npu/a2a3/TPush.hpp accept both old-style and new-style TPipe template argument layouts
  • interpret the 5th template argument as IsNoSplit only for 0/1, otherwise treat it as LocalSlotNum
  • keep current behavior for new call sites while restoring compatibility with current ptoas v0.24 output

Problem

Current generated code from ptoas v0.24 may emit:

TPipe<0, Direction::DIR_C2V, 4096, 8, 8>

but the current A2A3 header declares the 5th template parameter as bool IsNoSplit, which makes that form fail to compile.
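The failure can be reproduced with a small stand-in, assuming the header's signature looks roughly like the template below. `StrictTPipe`, `FlagId`, `SlotSize`, the `k`-prefixed members, and the default values are hypothetical names chosen for illustration; only `Direction::DIR_C2V`, `IsNoSplit`, `SlotNum`, and `LocalSlotNum` come from this PR. A conforming compiler rejects `8` as a `bool` template argument because the conversion would be narrowing.

```cpp
#include <cstdint>

// Only DIR_C2V appears in the PR; the enum itself is a stand-in.
enum class Direction { DIR_C2V };

// Hypothetical stand-in for the current A2A3 signature:
// the 5th template parameter is a bool.
template <std::uint32_t FlagId, Direction Dir, std::uint32_t SlotSize,
          std::uint32_t SlotNum, bool IsNoSplit = false,
          std::uint32_t LocalSlotNum = 2 /* assumed default */>
struct StrictTPipe {
    static constexpr bool kIsNoSplit = IsNoSplit;
    static constexpr std::uint32_t kLocalSlotNum = LocalSlotNum;
};

// New-style call site: compiles, because the 5th argument is a bool.
using NewStyleCall = StrictTPipe<0, Direction::DIR_C2V, 4096, 8, true, 8>;
static_assert(NewStyleCall::kIsNoSplit, "5th argument is read as IsNoSplit");
static_assert(NewStyleCall::kLocalSlotNum == 8, "6th argument is LocalSlotNum");

// Old-style call site as emitted by ptoas v0.24: does not compile here,
// since 8 cannot be converted to bool without narrowing.
// using OldStyleCall = StrictTPipe<0, Direction::DIR_C2V, 4096, 8, 8>;
```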

Fix

Use a compatibility parameter in the 5th position and derive:

  • IsNoSplit from 0/1
  • the effective local slot count from values >1

This preserves support for both:

  • old style: TPipe<..., SlotNum, LocalSlotNum>
  • new style: TPipe<..., SlotNum, IsNoSplit, LocalSlotNum>
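The derivation described above can be sketched as follows. This is a minimal illustration, not the actual contents of TPush.hpp: `TPipeSketch`, `FlagId`, `SlotSize`, `EffectiveLocalSlotNum`, and the default values are assumptions. Only `CompatLocalSlotNumOrIsNoSplit`, `IsNoSplit`, `SlotNum`, `LocalSlotNum`, and `Direction::DIR_C2V` appear in this PR.

```cpp
#include <cstdint>

// Only DIR_C2V appears in the PR; the enum itself is a stand-in.
enum class Direction { DIR_C2V };

// Sketch: the 5th parameter is an integer that carries either the
// IsNoSplit flag (0 or 1) or an old-style local slot count (> 1).
template <std::uint32_t FlagId, Direction Dir, std::uint32_t SlotSize,
          std::uint32_t SlotNum,
          std::uint32_t CompatLocalSlotNumOrIsNoSplit = 0,
          std::uint32_t LocalSlotNum = 2 /* assumed default */>
struct TPipeSketch {
    // Only the value 1 means "no split"; 0 and any count > 1 do not.
    static constexpr bool IsNoSplit = (CompatLocalSlotNumOrIsNoSplit == 1);

    // Values > 1 in the 5th position are taken as the local slot count;
    // otherwise the 6th parameter is used.
    static constexpr std::uint32_t EffectiveLocalSlotNum =
        (CompatLocalSlotNumOrIsNoSplit > 1) ? CompatLocalSlotNumOrIsNoSplit
                                            : LocalSlotNum;
};

// Old style, as emitted by ptoas v0.24: 5th argument is the local slot count.
using OldStyle = TPipeSketch<0, Direction::DIR_C2V, 4096, 8, 8>;
static_assert(!OldStyle::IsNoSplit, "8 is not an IsNoSplit flag");
static_assert(OldStyle::EffectiveLocalSlotNum == 8, "8 is the local slot count");

// New style: 5th argument is IsNoSplit, 6th is LocalSlotNum.
using NewStyle = TPipeSketch<0, Direction::DIR_C2V, 4096, 8, 1, 4>;
static_assert(NewStyle::IsNoSplit, "1 sets IsNoSplit");
static_assert(NewStyle::EffectiveLocalSlotNum == 4, "6th argument is used");
```

One consequence of this encoding, under the assumptions above, is that an old-style call site cannot request a local slot count of 0 or 1, because those values are read as the IsNoSplit flag.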

Validation

  • confirmed hw-native-sys/ptoas latest release is v0.24
  • confirmed current hw-native-sys/pto-isa:main still had the incompatible A2A3 signature
  • verified the patch matches the downstream compatibility workaround that unblocked ptoas v0.24 generated A2A3 kernels

Closes #67

gemini-code-assist bot left a comment

Code Review

This pull request refactors the TPipe template parameters in TPush.hpp to introduce CompatLocalSlotNumOrIsNoSplit, replacing the IsNoSplit boolean. This update allows for more flexible local slot number configurations while maintaining backward compatibility. Feedback was provided to simplify the boolean logic for the IsNoSplit constant to improve code clarity.

Comment on lines +38 to +39
static constexpr bool IsNoSplit =
CompatLocalSlotNumOrIsNoSplit <= 1 && static_cast<bool>(CompatLocalSlotNumOrIsNoSplit);

medium

The logic for IsNoSplit can be simplified. Since CompatLocalSlotNumOrIsNoSplit is a uint32_t, the expression CompatLocalSlotNumOrIsNoSplit <= 1 && static_cast<bool>(CompatLocalSlotNumOrIsNoSplit) is logically equivalent to CompatLocalSlotNumOrIsNoSplit == 1. This is more concise and improves readability while maintaining the same behavior (evaluating to true only when the value is 1).

    static constexpr bool IsNoSplit = (CompatLocalSlotNumOrIsNoSplit == 1);
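The claimed equivalence can be checked at compile time. The sketch below wraps both expressions in `constexpr` helper functions (`isNoSplitOriginal` and `isNoSplitSimplified` are hypothetical names, not part of the header) and compares them on the boundary values plus the slot count from the PR description.

```cpp
#include <cstdint>

// Expression currently in the patch (lines +38 to +39).
constexpr bool isNoSplitOriginal(std::uint32_t v) {
    return v <= 1 && static_cast<bool>(v);
}

// Simplified form suggested in the review.
constexpr bool isNoSplitSimplified(std::uint32_t v) {
    return v == 1;
}

// For an unsigned value, "<= 1 and nonzero" leaves only the value 1.
static_assert(isNoSplitOriginal(0) == isNoSplitSimplified(0), "0: both false");
static_assert(isNoSplitOriginal(1) == isNoSplitSimplified(1), "1: both true");
static_assert(isNoSplitOriginal(2) == isNoSplitSimplified(2), "2: both false");
static_assert(isNoSplitOriginal(8) == isNoSplitSimplified(8), "8: both false");
static_assert(isNoSplitOriginal(UINT32_MAX) == isNoSplitSimplified(UINT32_MAX),
              "max value: both false");
```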

Development

Successfully merging this pull request may close these issues.

A2A3 TPipe template is incompatible with ptoas 0.24 old-style LocalSlotNum emission