-
Notifications
You must be signed in to change notification settings - Fork 15
Spatial Task Placement for Multi-CGRA #254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…d SARA scoring This commit implements: 1. MCT dependency analysis for SSA and memory (RAW, WAR, WAW). 2. Critical path prioritization for task placement using ALAP levels. 3. SARA-style scoring heuristic for CGRA placement. 4. Memory mapping with SRAM assignment and direct wire configuration for fusion candidates. 5. Simplified and updated tests for placement verification.
|
|
||
| // PLACEMENT: task_name = "Task_0" | ||
| // PLACEMENT: cgra_col = 2 : i32, cgra_count = 1 : i32, cgra_row = 1 : i32 | ||
| // PLACEMENT: task_name = "Task_1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simplified check here for a better view. Otherwise, it gets super messy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it fine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces a dependency-analysis pass and a spatial placement pass for Minimized Canonicalized Tasks (MCTs) onto a 2D multi-CGRA grid, plus tests that validate the new placement attributes on several kernels.
Changes:
- Add
AnalyzeMCTDependencyPassto detect SSA and memory (RAW/WAR/WAW) dependencies between MCTs and to identify same-header fusion candidates. - Add
PlaceMCTOnCGRAPassimplementing a critical-path–driven placement heuristic with adjacency-aware scoring and basic memory/SRAM assignment, and register both passes in the Taskflow pass pipeline and build system. - Extend multi-CGRA Taskflow tests (parallel-nested, multi-nested, irregular-loop) with new RUN lines that invoke the placement pass and FileCheck the resulting
taskflow.taskplacement attributes.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| test/multi-cgra/taskflow/parallel-nested/parallel-nested.mlir | Adds a placement pipeline invocation and PLACEMENT checks for 2 tasks’ CGRA coordinates and counts. |
| test/multi-cgra/taskflow/multi-nested/multi-nested.mlir | Adds placement pipeline and PLACEMENT checks for 5 tasks’ CGRA coordinates and counts in a more complex nested-loop scenario. |
| test/multi-cgra/taskflow/irregular-loop/irregular-loop.mlir | Extends the irregular-loop test with a placement pipeline and PLACEMENT checks for 3 tasks’ CGRA coordinates and counts. |
| lib/TaskflowDialect/Transforms/PlaceMCTOnCGRAPass.cpp | Implements the CGRA placer, including counter-chain extraction, dependency graph construction, ALAP-based task prioritization, heuristic scoring, placement annotation, and memory/SRAM mapping. |
| lib/TaskflowDialect/Transforms/CMakeLists.txt | Registers the new analysis and placement passes with the Taskflow transforms library build. |
| lib/TaskflowDialect/Transforms/AnalyzeMCTDependencyPass.cpp | Implements MCT dependency analysis, printing detailed per-task counter-chain and read/write sets plus a dependency summary. |
| include/TaskflowDialect/TaskflowPasses.td | Declares the analyze-mct-dependency and place-mct-on-cgra passes with documentation used by MLIR’s pass infrastructure. |
| include/TaskflowDialect/TaskflowPasses.h | Declares factory functions for constructing the new Taskflow passes. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
1. Added explicit C++ standard library headers (<algorithm>, <climits>, <cmath>, <string>, <vector>) to avoid transitive include dependencies. 2. Added error handling for grid over-subscription case in findBestPlacement(): when no available CGRA position is found, emits a warning and falls back to position (0,0).
1. Changed output from llvm::outs() to llvm::errs() to avoid stdout/IR conflicts. 2. Simplified Value printing to avoid IR ownership issues during output. 3. Added dependency-analysis.mlir test to verify SSA dependency detection. Addresses Copilot review comment about missing tests for analyze-mct-dependency pass.
| // DEPENDENCY: === MCT Dependency Analysis === | ||
| // DEPENDENCY: Found 2 MCTs | ||
| // DEPENDENCY: MCT 0: Task_0 | ||
| // DEPENDENCY: MCT 1: Task_1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added according to @copilot 's suggestion, but i doubt whether it is necessary...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay even Copilot itself thinks it should be removed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems this pass does not have any "real" modifications on the pass. We can add these optimizations later, let's just focus on mapping in this pr.
| // This pass identifies: | ||
| // 1. SSA dependencies: Task output → Task input (data flow). | ||
| // 2. Memory dependencies: RAW, WAR, WAW via shared memrefs. | ||
| // 3. Same-header pairs: Fusion candidates for data forwarding. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same header optimization should not be nested in a dependency analysis pass. Please mimic the neura dialect to create an Optimizations folder like lib/TaskflowDialect/Transforms/Optimizations for these optimizations.
| //===----------------------------------------------------------------------===// | ||
| /// Represents the counter chain (loop header bounds) of an MCT. | ||
| struct CounterChainInfo { | ||
| SmallVector<int64_t> bounds; // e.g., {4, 8, 6} for 0→4, 0→8, 0→6. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may not be sufficient for any static affine loops. Because general loops have {lower_bound, upper_bound, step}.
| SetVector<Value> source_memref_reads; // Source memrefs (function args or task outputs). | ||
| SetVector<Value> source_memref_writes; // Source memrefs that are written. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you misunderstood the memory_inputs and memory_outputs defined in taskflow.task.
The memory inputs actually indicate that this task depends on which memref (both read & write). The memory outputs mean that which memref is changed by this task.
The memory inputs mean that when all the dependent memrefs are ready, and the values are ready, the task can be triggered.
The memory outputs mean that this task will modify this memref, all the other tasks need this memref must wait until this memref is modified.
| // Checks SSA dependencies: if this task's input is another task's output. | ||
| for (Value input : task.getMemoryInputs()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you use memory inputs for the SSA dependency check?
The SSA dependency appears in the value inputs/outputs of task.
| let constructor = "taskflow::createCanonicalizeTaskPass()"; | ||
| } | ||
|
|
||
| def AnalyzeMCTDependency : Pass<"analyze-mct-dependency", "func::FuncOp"> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- This dependency analysis pass is not only suitable for MCT but for all canonical tasks. So it should be called
analyze-canonical-task-dependency. - The dependency is already explicitly identified by the task inputs & outputs, why need this analysis? And this pass does not change any ir right?
- I think we should rename the Minimized Canonical Task to Atomic Canonical Task (ACT) in future optimizations, which is more consistent with its functionality (i.e., served as the input for fusion).
| } | ||
| }; | ||
|
|
||
| //===----------------------------------------------------------------------===// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why should we consider the SSA value dependency in mapping? Should we construct a task & memref graph for mapping?
| // DEPENDENCY: === MCT Dependency Analysis === | ||
| // DEPENDENCY: Found 2 MCTs | ||
| // DEPENDENCY: MCT 0: Task_0 | ||
| // DEPENDENCY: MCT 1: Task_1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems this pass does not have any "real" modifications on the pass. We can add these optimizations later, let's just focus on mapping in this pr.
Summary
This update introduces a spatial mapping and memory management framework for mapping Minimized Canonicalized Tasks (MCTs) onto multi-core CGRA grids.
Key Features
1. Dependency & Fusion Analysis (
AnalyzeMCTDependencyPass)A dual-layer dependency analysis was implemented:
2. Priority-Driven Spatial Placement (
PlaceMCTOnCGRAPass)mapping_utils.cpp.3. Data Forwarding (SRAM Bypass)
Tests
test/taskfloware updated with the above features. I am using simplified checks here for a more straightforward view.