Fix: rename enable_profiling to runtime_profiling in RunConfig #91
Conversation
📝 Walkthrough: Rename profiling option across example scripts.
📒 Files selected for processing: 21
🚧 Files skipped from review as similar to previous changes: 18
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Code Review
This pull request renames the `enable_profiling` parameter to `runtime_profiling` within `RunConfig` across multiple example scripts. The reviewer recommends renaming the associated local variables and function parameters to `runtime_profiling` as well, to keep naming consistent throughout the modified files.
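The reviewer's suggestion can be sketched as follows. `RunConfig` is stubbed with a dataclass here, and `build_config` is a hypothetical helper; the real class comes from the pypto runtime API and may carry more options:

```python
# Sketch of the suggested end-to-end rename: not just the RunConfig keyword,
# but the function parameter feeding it is also called runtime_profiling.
from dataclasses import dataclass

@dataclass
class RunConfig:               # stand-in for the real pypto RunConfig
    dump_passes: bool
    backend_type: str
    runtime_profiling: bool    # renamed from enable_profiling in this PR

def build_config(dump_passes: bool, backend: str, runtime_profiling: bool) -> RunConfig:
    # With the parameter renamed too, the call site has no leftover old name:
    return RunConfig(
        dump_passes=dump_passes,
        backend_type=backend,
        runtime_profiling=runtime_profiling,
    )
```

This avoids the mixed-name call sites visible in the diffs below, where the keyword is new but the right-hand variable still uses the old name.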
examples/beginner/hello_world.py (outdated):
      dump_passes=dump_passes,
      backend_type=backend,
    - enable_profiling=enable_profiling,
    + runtime_profiling=enable_profiling,
examples/beginner/matmul.py (outdated):
      dump_passes=dump_passes,
      backend_type=backend,
    - enable_profiling=enable_profiling,
    + runtime_profiling=enable_profiling,
examples/intermediate/gemm.py (outdated):
      dump_passes=dump_passes,
      backend_type=backend,
    - enable_profiling=enable_profiling,
    + runtime_profiling=enable_profiling,
examples/intermediate/layer_norm.py (outdated):
      dump_passes=dump_passes,
      backend_type=backend,
    - enable_profiling=enable_profiling,
    + runtime_profiling=enable_profiling,
examples/intermediate/rms_norm.py (outdated):
      dump_passes=dump_passes,
      backend_type=backend,
    - enable_profiling=enable_profiling,
    + runtime_profiling=enable_profiling,
      dump_passes=dump_passes,
      backend_type=backend,
    - enable_profiling=enable_profiling,
    + runtime_profiling=enable_profiling,
      dump_passes=dump_passes,
      backend_type=BackendType.Ascend950,
    - enable_profiling=enable_profiling,
    + runtime_profiling=enable_profiling,
      dump_passes=dump_passes,
      backend_type=BackendType.Ascend950,
    - enable_profiling=enable_profiling,
    + runtime_profiling=enable_profiling,
      dump_passes=dump_passes,
      backend_type=BackendType.Ascend950,
    - enable_profiling=enable_profiling,
    + runtime_profiling=enable_profiling,
      backend_type=BackendType.CCE,
      work_dir=work_dir,
    - enable_profiling=enable_profiling,
    + runtime_profiling=enable_profiling,
The pypto runtime API renamed `RunConfig.enable_profiling` to `RunConfig.runtime_profiling`. Update all 21 example files to match.
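One mechanical way to apply the keyword rename across the examples tree is a grep-plus-sed pass; the paths and file contents below are illustrative, not taken from the PR:

```shell
# Hedged sketch: bulk-rename the RunConfig keyword in a tree of example files.
# Only the keyword (left of '=') changes; the local variable on the right-hand
# side keeps its old name, matching the per-file diffs in this review.
tmp_dir=$(mktemp -d)
cat > "$tmp_dir/hello_world.py" <<'EOF'
    dump_passes=dump_passes,
    backend_type=backend,
    enable_profiling=enable_profiling,
EOF
# Dry-run first with grep to see which files would change, then edit in place.
grep -rl 'enable_profiling=' "$tmp_dir" | xargs sed -i 's/enable_profiling=/runtime_profiling=/g'
grep 'runtime_profiling' "$tmp_dir/hello_world.py"
```

Matching on the trailing `=` is what keeps the right-hand variable untouched; a bare rename of `enable_profiling` would also rewrite the local variable, which is a separate (reviewer-suggested) cleanup.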
Force-pushed 453c80b to 23f78cf
…ons with FP32 projection output

Scope 1 (prefill_scope1.py):
- Scale to full Qwen3-32B: NUM_HEADS=64, MAX_SEQ=4096, HIDDEN=8192
- Change Q/K/V projection output from BF16 to FP32, aligning with decode_scope1 so scope 2 receives FP32 inputs without extra round-trip
- Remove intermediate FP32 buffer + BF16 cast; assemble matmul accumulator directly to output tensor
- Increase TOK_TILE from 16 to 64 for fewer task blocks
- Simplify golden: full-sequence vectorization, FP32 output
- Tighten tolerance from 2e-3 to 1e-3

Scope 1 wIOBuffer (prefill_scope1_wIOBuffer.py) [NEW]:
- I/O buffer caching wrapper for scope 1 via io_cache plugin
- wrap_golden_multi() helper for multi-output golden caching

Scope 3 (prefill_scope3.py):
- Scale to full Qwen3-32B: HIDDEN=8192, INTERMEDIATE=25600, MAX_SEQ=4096
- Increase TOK_TILE to 64, MLP_OUT_CHUNK to 128
- Simplify golden: full-sequence vectorization, pre-convert weights to FP32
- Clean up stage numbering and comments

Scope 3 wIOBuffer (prefill_scope3_wIOBuffer.py):
- Fix parameter name: enable_profiling -> runtime_profiling

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
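The commit message mentions a wrap_golden_multi() helper for multi-output golden caching. A hypothetical sketch of what such a helper could look like — the name comes from the commit, but the signature, cache layout, and behavior here are all assumptions, not the actual io_cache plugin code:

```python
# Hypothetical multi-output golden cache: persist the tuple of reference
# arrays returned by an expensive golden function as .npy files, so repeated
# runs skip the recompute. All names besides wrap_golden_multi are invented.
import os
import numpy as np

def wrap_golden_multi(golden_fn, cache_dir, num_outputs):
    """Return a wrapper that caches golden_fn's tuple of arrays on disk."""
    def wrapper(*args, **kwargs):
        paths = [os.path.join(cache_dir, f"golden_{i}.npy") for i in range(num_outputs)]
        if all(os.path.exists(p) for p in paths):
            return tuple(np.load(p) for p in paths)  # cache hit: load, skip recompute
        outputs = golden_fn(*args, **kwargs)
        os.makedirs(cache_dir, exist_ok=True)
        for path, out in zip(paths, outputs):
            np.save(path, out)
        return tuple(outputs)
    return wrapper
```

A real implementation would also key the cache on the input arguments (e.g. a hash of shapes and seeds) so a changed input invalidates stale goldens; this sketch keys only on the directory.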
Summary
Rename `RunConfig.enable_profiling` to `RunConfig.runtime_profiling`
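For downstream code migrating across this rename, one hedged pattern (not part of the PR, and the real pypto `RunConfig` may differ) is a shim that still accepts the old keyword but warns:

```python
# Migration shim sketch: accept the deprecated enable_profiling keyword,
# map it onto runtime_profiling, and emit a DeprecationWarning. This is an
# illustration of the pattern, not the actual pypto RunConfig.
import warnings

class RunConfig:
    def __init__(self, runtime_profiling=False, enable_profiling=None, **kwargs):
        if enable_profiling is not None:
            warnings.warn(
                "enable_profiling is renamed to runtime_profiling",
                DeprecationWarning,
                stacklevel=2,
            )
            runtime_profiling = enable_profiling
        self.runtime_profiling = runtime_profiling
        self.extra = kwargs  # other options (dump_passes, backend_type, ...)
```

This PR instead updates all call sites directly, which is the cleaner option when, as here, every caller lives in the same repository.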