add qwen3 scope3 tile helper example #94
high-cloud wants to merge 1 commit into hw-native-sys:main
What changed
Add examples/models/qwen3/qwen3_32b_decode_scope3_tile.py as a tile/helper-oriented Scope 3 example for Qwen3 decode.
Why
The existing qwen3_32b_decode_scope3.py file is useful for baseline scope validation, but the optimized helper version is easier to review when inspecting tile-level structure and fused InCore boundaries.
Impact
Performance
The helper structure in qwen3_32b_decode_scope3_tile.py comes from the same Scope 3 fusion work that was profiled on examples/models/qwen3/qwen3_32b_decode_scope3.py with runtime profiling enabled.
Measured with:
python examples/models/qwen3/qwen3_32b_decode_scope3.py -p a2a3 -d 7 --runtime-profiling
Representative results:
3295.18 us (baseline)
3038.68 us (gate-only fused, -7.8% vs baseline)
2822.68 us (-14.3% vs baseline, -7.1% vs gate-only fused)
Validation
python examples/models/qwen3/qwen3_32b_decode_scope3_tile.py -p a2a3 -d 7
python examples/models/qwen3/qwen3_32b_decode_scope3.py -p a2a3 -d 7 --runtime-profiling
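The percentages quoted under Performance can be rechecked from the three timings alone. The sketch below is not part of the PR. It assumes that the first timing is the baseline and the second is the gate-only fused variant, which is how the "vs baseline" and "vs gate-only fused" comparisons read. The variable names and the relative_change_pct helper are hypothetical, and the third timing is left unlabeled because the description does not name it.

```python
# Timings copied from the representative results in this PR description (microseconds).
# The labels for the first two values are an assumption inferred from the
# "vs baseline" and "vs gate-only fused" comparisons; the third is not named.
baseline_us = 3295.18
gate_only_fused_us = 3038.68
third_result_us = 2822.68


def relative_change_pct(new_us: float, ref_us: float) -> float:
    """Signed percentage change of new_us relative to ref_us (negative means faster)."""
    return (new_us - ref_us) / ref_us * 100.0


changes = {
    "gate-only fused vs baseline": relative_change_pct(gate_only_fused_us, baseline_us),
    "third result vs baseline": relative_change_pct(third_result_us, baseline_us),
    "third result vs gate-only fused": relative_change_pct(third_result_us, gate_only_fused_us),
}

for label, pct in changes.items():
    print(f"{label}: {pct:+.1f}%")
```

Under those assumptions the three printed values round to the same figures quoted in the results (-7.8%, -14.3%, -7.1%).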