Spatial Task Placement for Multi-CGRA #254

guosran · 2026-01-30T02:06:27Z

Summary

This update introduces a spatial placement and memory management framework that maps Minimized Canonicalized Tasks (MCTs) onto multi-CGRA grids.

Key Features

1. Dependency & Fusion Analysis (`AnalyzeMCTDependencyPass`)

A dual-layer dependency analysis was implemented:

Dependency Graph: Detects SSA operand flow and memory-based hazards (RAW, WAR, WAW).
Fusion Detection: Identifies loops that share the same header as fusion candidates.
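As a rough illustration of the memory-hazard half of this analysis, the sketch below classifies RAW, WAR and WAW hazards between two tasks from their read and write sets. The `TaskAccess` struct, the `classifyHazards` helper and the use of strings as memref identifiers are assumptions made for the example, not the data structures used in `AnalyzeMCTDependencyPass`.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical summary of one task's memory footprint. Memrefs are
// identified by name here purely for illustration.
struct TaskAccess {
  std::set<std::string> reads;
  std::set<std::string> writes;
};

enum class Hazard { RAW, WAR, WAW };

// Returns true if the two sets share at least one element.
static bool intersects(const std::set<std::string> &a,
                       const std::set<std::string> &b) {
  for (const std::string &x : a)
    if (b.count(x))
      return true;
  return false;
}

// Classifies the hazards from an earlier task to a later task
// (`earlier` precedes `later` in program order).
std::vector<Hazard> classifyHazards(const TaskAccess &earlier,
                                    const TaskAccess &later) {
  std::vector<Hazard> hazards;
  if (intersects(earlier.writes, later.reads))
    hazards.push_back(Hazard::RAW); // later reads what earlier wrote.
  if (intersects(earlier.reads, later.writes))
    hazards.push_back(Hazard::WAR); // later overwrites what earlier read.
  if (intersects(earlier.writes, later.writes))
    hazards.push_back(Hazard::WAW); // both tasks write the same memref.
  return hazards;
}
```

A pair of tasks can carry more than one hazard at once, which is why the helper returns a list rather than a single value.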

2. Priority-Driven Spatial Placement (`PlaceMCTOnCGRAPass`)

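Critical Path Prioritization (ALAP): Tasks are prioritized by their ALAP levels, mimicking mapping_utils.cpp.
Scoring Heuristic: A SARA-style score selects the CGRA position for each task.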
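To make the ALAP prioritization concrete, here is a minimal sketch of computing ALAP levels on a task dependency DAG and deriving a placement order from them. It assumes the tasks are already numbered in topological order and that lower ALAP levels are placed first. `computeAlapLevels` and `placementOrder` are hypothetical names, and the logic is not taken from mapping_utils.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Computes ALAP levels for a DAG whose nodes are numbered in topological
// order (every edge u -> v has u < v). succs[u] lists the successors of u.
// Every sink gets the last level; every other node sits one level before
// its earliest-scheduled successor.
std::vector<int> computeAlapLevels(const std::vector<std::vector<int>> &succs) {
  const int n = static_cast<int>(succs.size());
  // Longest distance (in edges) from each node to any sink.
  std::vector<int> toSink(n, 0);
  for (int u = n - 1; u >= 0; --u) {
    for (int v : succs[u]) {
      assert(v > u && "nodes must be numbered in topological order");
      toSink[u] = std::max(toSink[u], toSink[v] + 1);
    }
  }
  const int depth =
      n == 0 ? 0 : *std::max_element(toSink.begin(), toSink.end());
  std::vector<int> alap(n);
  for (int u = 0; u < n; ++u)
    alap[u] = depth - toSink[u];
  return alap;
}

// Orders tasks by ascending ALAP level, so tasks at the start of the
// critical path are placed first. Ties keep their original order.
std::vector<int> placementOrder(const std::vector<int> &alap) {
  std::vector<int> order(alap.size());
  for (std::size_t i = 0; i < order.size(); ++i)
    order[i] = static_cast<int>(i);
  std::stable_sort(order.begin(), order.end(),
                   [&](int a, int b) { return alap[a] < alap[b]; });
  return order;
}
```

For the edges 0→3, 1→2, 2→3, task 1 heads the longest chain and gets level 0, while task 0 has slack and gets level 1, so task 1 is placed before task 0.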

3. Data Forwarding (SRAM Bypass)

Locality-Aware SRAM Assignment: Maps MemRefs to the nearest physical SRAM relative to the task's CGRA position.
Direct Wire Configuration: For adjacent fusion candidates, the mapper establishes a direct interconnect path.
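The two ideas above can be sketched as follows, under the assumptions that "nearest" means smallest Manhattan distance on the grid and that "adjacent" means a Manhattan distance of exactly 1. `GridPos`, `nearestSram` and `isAdjacent` are illustrative names only; the distance metric and the tie-breaking rule used by the pass may differ.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// A position on the 2D grid (assumed layout: row, column).
struct GridPos {
  int row;
  int col;
};

// Manhattan distance between two grid positions.
int manhattan(GridPos a, GridPos b) {
  return std::abs(a.row - b.row) + std::abs(a.col - b.col);
}

// Returns the index of the SRAM closest to the task's CGRA position.
// Ties go to the SRAM with the lower index.
int nearestSram(GridPos task, const std::vector<GridPos> &srams) {
  assert(!srams.empty() && "at least one SRAM is required");
  int best = 0;
  for (int i = 1; i < static_cast<int>(srams.size()); ++i) {
    if (manhattan(task, srams[i]) < manhattan(task, srams[best]))
      best = i;
  }
  return best;
}

// Two CGRAs are treated as adjacent (a direct-wire candidate in this
// sketch) when they are exactly one hop apart.
bool isAdjacent(GridPos a, GridPos b) { return manhattan(a, b) == 1; }
```

With SRAMs at (0,0), (0,3) and (3,0), a task placed at (1,2) would be assigned the SRAM at index 1, since its distance of 2 is the smallest.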

Tests

All tests under test/taskflow are updated with the above features. I am using simplified checks here for a more straightforward view.

…d SARA scoring

This commit implements:
1. MCT dependency analysis for SSA and memory (RAW, WAR, WAW).
2. Critical path prioritization for task placement using ALAP levels.
3. SARA-style scoring heuristic for CGRA placement.
4. Memory mapping with SRAM assignment and direct wire configuration for fusion candidates.
5. Simplified and updated tests for placement verification.

guosran · 2026-01-30T02:08:04Z

test/multi-cgra/taskflow/irregular-loop/irregular-loop.mlir


+// PLACEMENT: task_name = "Task_0"
+// PLACEMENT: cgra_col = 2 : i32, cgra_count = 1 : i32, cgra_row = 1 : i32
+// PLACEMENT: task_name = "Task_1"


I simplified the checks here for a clearer view. Otherwise, it gets super messy.

Is it fine?


Copilot

Pull request overview

This PR introduces a dependency-analysis pass and a spatial placement pass for Minimized Canonicalized Tasks (MCTs) onto a 2D multi-CGRA grid, plus tests that validate the new placement attributes on several kernels.

Changes:

Add AnalyzeMCTDependencyPass to detect SSA and memory (RAW/WAR/WAW) dependencies between MCTs and to identify same-header fusion candidates.
Add PlaceMCTOnCGRAPass implementing a critical-path–driven placement heuristic with adjacency-aware scoring and basic memory/SRAM assignment, and register both passes in the Taskflow pass pipeline and build system.
Extend multi-CGRA Taskflow tests (parallel-nested, multi-nested, irregular-loop) with new RUN lines that invoke the placement pass and FileCheck the resulting taskflow.task placement attributes.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.


| File | Description |
| --- | --- |
| test/multi-cgra/taskflow/parallel-nested/parallel-nested.mlir | Adds a placement pipeline invocation and PLACEMENT checks for 2 tasks’ CGRA coordinates and counts. |
| test/multi-cgra/taskflow/multi-nested/multi-nested.mlir | Adds placement pipeline and PLACEMENT checks for 5 tasks’ CGRA coordinates and counts in a more complex nested-loop scenario. |
| test/multi-cgra/taskflow/irregular-loop/irregular-loop.mlir | Extends the irregular-loop test with a placement pipeline and PLACEMENT checks for 3 tasks’ CGRA coordinates and counts. |
| lib/TaskflowDialect/Transforms/PlaceMCTOnCGRAPass.cpp | Implements the CGRA placer, including counter-chain extraction, dependency graph construction, ALAP-based task prioritization, heuristic scoring, placement annotation, and memory/SRAM mapping. |
| lib/TaskflowDialect/Transforms/CMakeLists.txt | Registers the new analysis and placement passes with the Taskflow transforms library build. |
| lib/TaskflowDialect/Transforms/AnalyzeMCTDependencyPass.cpp | Implements MCT dependency analysis, printing detailed per-task counter-chain and read/write sets plus a dependency summary. |
| include/TaskflowDialect/TaskflowPasses.td | Declares the `analyze-mct-dependency` and `place-mct-on-cgra` passes with documentation used by MLIR’s pass infrastructure. |
| include/TaskflowDialect/TaskflowPasses.h | Declares factory functions for constructing the new Taskflow passes. |


lib/TaskflowDialect/Transforms/PlaceMCTOnCGRAPass.cpp

lib/TaskflowDialect/Transforms/AnalyzeMCTDependencyPass.cpp

1. Added explicit C++ standard library headers (<algorithm>, <climits>, <cmath>, <string>, <vector>) to avoid transitive include dependencies.
2. Added error handling for the grid over-subscription case in findBestPlacement(): when no available CGRA position is found, it emits a warning and falls back to position (0,0).
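The over-subscription fallback described in this commit could look roughly like the sketch below. Everything here is an assumption made for illustration: the function name `findBestPlacementSketch`, the occupancy grid, the score (total Manhattan distance to already placed dependent tasks, lower is better) and the use of `std::cerr` in place of whatever diagnostic mechanism the pass actually uses.

```cpp
#include <climits>
#include <cstdlib>
#include <iostream>
#include <vector>

struct Pos {
  int row;
  int col;
};

// Hypothetical placer step. occupied[r][c] is true once a task owns that
// CGRA; placedNeighbors holds the positions of already placed tasks that
// the current task depends on or feeds.
Pos findBestPlacementSketch(const std::vector<std::vector<bool>> &occupied,
                            const std::vector<Pos> &placedNeighbors) {
  int bestScore = INT_MAX;
  Pos best{-1, -1};
  for (int r = 0; r < static_cast<int>(occupied.size()); ++r) {
    for (int c = 0; c < static_cast<int>(occupied[r].size()); ++c) {
      if (occupied[r][c])
        continue; // skip CGRAs that are already taken.
      // Assumed score: total distance to placed neighbors, lower is better.
      int score = 0;
      for (const Pos &p : placedNeighbors)
        score += std::abs(p.row - r) + std::abs(p.col - c);
      if (score < bestScore) {
        bestScore = score;
        best = {r, c};
      }
    }
  }
  if (best.row < 0) {
    // No free CGRA: warn and fall back to (0,0), as the commit describes.
    std::cerr << "warning: CGRA grid is over-subscribed; "
                 "falling back to (0,0)\n";
    return {0, 0};
  }
  return best;
}
```

Note that the fallback lets two tasks share position (0,0), so a caller that needs exclusive placement would have to treat the warning as an error.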

1. Changed output from llvm::outs() to llvm::errs() to avoid stdout/IR conflicts.
2. Simplified Value printing to avoid IR ownership issues during output.
3. Added dependency-analysis.mlir test to verify SSA dependency detection.

Addresses Copilot review comment about missing tests for analyze-mct-dependency pass.

guosran · 2026-01-30T02:34:16Z

test/multi-cgra/taskflow/dependency-analysis/dependency-analysis.mlir

+// DEPENDENCY: === MCT Dependency Analysis ===
+// DEPENDENCY: Found 2 MCTs
+// DEPENDENCY: MCT 0: Task_0
+// DEPENDENCY: MCT 1: Task_1


Added according to @copilot's suggestion, but I doubt whether it is necessary...

Okay, even Copilot itself thinks it should be removed.

It seems this pass does not make any "real" modifications to the IR. We can add these optimizations later; let's just focus on mapping in this PR.

Copilot · 2026-01-30T02:34:24Z

@guosran I've opened a new pull request, #255, to work on those changes. Once the pull request is ready, I'll request review from you.

ShangkunLi · 2026-01-30T05:32:32Z

lib/TaskflowDialect/Transforms/AnalyzeMCTDependencyPass.cpp

+// This pass identifies:
+// 1. SSA dependencies: Task output → Task input (data flow).
+// 2. Memory dependencies: RAW, WAR, WAW via shared memrefs.
+// 3. Same-header pairs: Fusion candidates for data forwarding.


The same-header optimization should not be nested in a dependency analysis pass. Please mimic the neura dialect and create an Optimizations folder, such as lib/TaskflowDialect/Transforms/Optimizations, for these optimizations.

ShangkunLi · 2026-01-30T05:35:29Z

lib/TaskflowDialect/Transforms/AnalyzeMCTDependencyPass.cpp

+//===----------------------------------------------------------------------===//
+/// Represents the counter chain (loop header bounds) of an MCT.
+struct CounterChainInfo {
+  SmallVector<int64_t> bounds; // e.g., {4, 8, 6} for 0→4, 0→8, 0→6.


This may not be sufficient for arbitrary static affine loops, because general loops have {lower_bound, upper_bound, step}.
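One possible shape for a richer counter-chain entry is sketched below. The `LoopBound` struct and both helpers are hypothetical, and the sketch assumes an exclusive upper bound and a positive step. It also shows why storing only a trip count loses information: two loops can iterate the same number of times yet have different headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One level of a counter chain: for (i = lower_bound; i < upper_bound; i += step).
struct LoopBound {
  int64_t lower_bound;
  int64_t upper_bound; // exclusive (assumption).
  int64_t step;        // must be positive (assumption).
};

// Number of iterations: ceil((upper_bound - lower_bound) / step), or 0 if empty.
int64_t tripCount(const LoopBound &b) {
  assert(b.step > 0 && "step must be positive");
  if (b.upper_bound <= b.lower_bound)
    return 0;
  return (b.upper_bound - b.lower_bound + b.step - 1) / b.step;
}

// Two counter chains describe the same header only if every level matches
// in all three fields, not merely in trip count.
bool sameHeader(const std::vector<LoopBound> &a,
                const std::vector<LoopBound> &b) {
  if (a.size() != b.size())
    return false;
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (a[i].lower_bound != b[i].lower_bound ||
        a[i].upper_bound != b[i].upper_bound || a[i].step != b[i].step)
      return false;
  }
  return true;
}
```

For example, the loops {0, 4, 1} and {0, 8, 2} both run four iterations, but `sameHeader` reports them as different because their induction variables take different values.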

ShangkunLi · 2026-01-30T06:03:54Z

lib/TaskflowDialect/Transforms/AnalyzeMCTDependencyPass.cpp

+  SetVector<Value> source_memref_reads;  // Source memrefs (function args or task outputs).
+  SetVector<Value> source_memref_writes; // Source memrefs that are written.


I think you misunderstood the memory_inputs and memory_outputs defined in taskflow.task.

The memory inputs indicate which memrefs this task depends on (both reads and writes). The memory outputs indicate which memrefs this task changes.

In other words, the task can be triggered once all the memrefs it depends on are ready and its input values are ready.

Likewise, because the task modifies its memory outputs, every other task that needs one of those memrefs must wait until the task has modified it.
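Under that reading, the tasks a given task must wait for follow directly from the two lists. The sketch below is a simplified model, not the taskflow.task op itself: memrefs are named by strings (in the IR they would be SSA values), tasks are listed in program order, and `TaskSketch` and `producersOf` are hypothetical names.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Simplified stand-in for a task's memory interface.
struct TaskSketch {
  std::string name;
  std::vector<std::string> memory_inputs;  // memrefs the task depends on.
  std::vector<std::string> memory_outputs; // memrefs the task modifies.
};

// Returns the indices of the earlier tasks that tasks[i] must wait for:
// for each memref in its memory_inputs, the most recent earlier task that
// lists that memref in its memory_outputs. A memref with no earlier
// producer (e.g. a function argument) is ready from the start.
std::vector<int> producersOf(const std::vector<TaskSketch> &tasks, int i) {
  std::set<int> producers;
  for (const std::string &memref : tasks[i].memory_inputs) {
    for (int j = i - 1; j >= 0; --j) {
      const std::vector<std::string> &outs = tasks[j].memory_outputs;
      if (std::find(outs.begin(), outs.end(), memref) != outs.end()) {
        producers.insert(j);
        break; // only the latest modification matters.
      }
    }
  }
  return std::vector<int>(producers.begin(), producers.end());
}
```

With tasks T0 (inputs A, outputs B), T1 (inputs B, outputs B) and T2 (inputs A and B, outputs C), T2 waits only for T1, because T1 is the last task to modify B and nothing modifies A.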

ShangkunLi · 2026-01-30T06:08:03Z

lib/TaskflowDialect/Transforms/AnalyzeMCTDependencyPass.cpp

+      // Checks SSA dependencies: if this task's input is another task's output.
+      for (Value input : task.getMemoryInputs()) {


Why did you use the memory inputs for the SSA dependency check?

SSA dependencies appear in the value inputs/outputs of the task.

ShangkunLi · 2026-01-30T06:10:53Z

include/TaskflowDialect/TaskflowPasses.td

  let constructor = "taskflow::createCanonicalizeTaskPass()";
 }
+
+def AnalyzeMCTDependency : Pass<"analyze-mct-dependency", "func::FuncOp"> {


This dependency analysis pass applies not only to MCTs but to all canonical tasks, so it should be called analyze-canonical-task-dependency.

The dependencies are already explicitly identified by the task inputs and outputs, so why do we need this analysis? And this pass does not change any IR, right?

I think we should rename the Minimized Canonical Task to Atomic Canonical Task (ACT) in future optimizations, which is more consistent with its functionality (i.e., serving as the input for fusion).

ShangkunLi · 2026-01-30T06:18:17Z

lib/TaskflowDialect/Transforms/PlaceMCTOnCGRAPass.cpp

+  }
+};
+
+//===----------------------------------------------------------------------===//


Why should we consider the SSA value dependency in mapping? Should we construct a task & memref graph for mapping?

ShangkunLi · 2026-01-30T06:20:10Z

test/multi-cgra/taskflow/dependency-analysis/dependency-analysis.mlir

+// DEPENDENCY: === MCT Dependency Analysis ===
+// DEPENDENCY: Found 2 MCTs
+// DEPENDENCY: MCT 0: Task_0
+// DEPENDENCY: MCT 1: Task_1


It seems this pass does not make any "real" modifications to the IR. We can add these optimizations later; let's just focus on mapping in this PR.


guosran marked this pull request as ready for review January 30, 2026 02:09

Copilot AI review requested due to automatic review settings January 30, 2026 02:09

Copilot started reviewing on behalf of guosran January 30, 2026 02:09


guosran marked this pull request as draft January 30, 2026 02:22


Copilot AI mentioned this pull request Jan 30, 2026

Remove unnecessary test for diagnostic-only AnalyzeMCTDependencyPass #255

Closed

guosran marked this pull request as ready for review January 30, 2026 02:51

guosran requested a review from ShangkunLi January 30, 2026 02:51

tancheng requested review from HobbitQia and YanzhouTang January 30, 2026 04:31



3 participants
