@guosran guosran commented Jan 30, 2026

Summary

This update introduces a spatial mapping and memory management framework that places Minimized Canonicalized Tasks (MCTs) onto multi-core CGRA grids.

Key Features

1. Dependency & Fusion Analysis (AnalyzeMCTDependencyPass)

A dual-layer dependency analysis was implemented:

  • Dependency Graph: Detects SSA operand flow and memory-based hazards (RAW, WAR, WAW).
  • Fusion Detection: Identifies tasks whose loops share the same header (counter chain); such pairs are fusion candidates for data forwarding.
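
The memory-hazard part of the analysis can be illustrated with a minimal, standalone sketch. It is not the code in AnalyzeMCTDependencyPass: `TaskAccess`, `classifyHazards`, and `hazardName` are hypothetical names, and memrefs are modeled as plain strings instead of MLIR values. It only shows how RAW, WAR, and WAW follow from the read and write sets of two tasks in program order.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical summary of the memrefs one task reads and writes.
// Memrefs are modeled as names; the real pass works on MLIR values.
struct TaskAccess {
  std::set<std::string> reads;
  std::set<std::string> writes;
};

enum class Hazard { RAW, WAR, WAW };

// Returns a printable name for a hazard kind.
const char *hazardName(Hazard h) {
  switch (h) {
  case Hazard::RAW: return "RAW";
  case Hazard::WAR: return "WAR";
  case Hazard::WAW: return "WAW";
  }
  assert(false && "unknown hazard kind");
  return "";
}

// Returns true if the two sets share at least one memref name.
static bool intersects(const std::set<std::string> &a,
                       const std::set<std::string> &b) {
  for (const std::string &name : a)
    if (b.count(name))
      return true;
  return false;
}

// Classifies the hazards between two tasks, where `earlier` precedes
// `later` in program order.
std::vector<Hazard> classifyHazards(const TaskAccess &earlier,
                                    const TaskAccess &later) {
  std::vector<Hazard> hazards;
  if (intersects(earlier.writes, later.reads))
    hazards.push_back(Hazard::RAW); // later reads what earlier wrote.
  if (intersects(earlier.reads, later.writes))
    hazards.push_back(Hazard::WAR); // later overwrites what earlier read.
  if (intersects(earlier.writes, later.writes))
    hazards.push_back(Hazard::WAW); // both tasks write the same memref.
  return hazards;
}
```

For example, if one task writes memref A and a later task reads A, `classifyHazards` reports a single RAW hazard; swapping the two tasks turns it into WAR.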

2. Priority-Driven Spatial Placement (PlaceMCTOnCGRAPass)

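  • Critical Path Prioritization (ALAP): Orders tasks for placement by their ALAP levels, mimicking mapping_utils.cpp.
  • Scoring Heuristic: A SARA-style score ranks candidate CGRA positions for each task.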
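
As a rough illustration of ALAP-based prioritization, the sketch below computes textbook ALAP levels on a task DAG and derives a placement order from them. It is an assumption about the general technique, not a copy of mapping_utils.cpp or of the pass: `computeAlapLevels` and `placementOrder` are hypothetical names, and task indices are assumed to be in topological order already.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// succs[i] lists the tasks that depend on task i. Task indices are assumed
// to be in topological order (every edge goes from a lower to a higher index).
std::vector<int> computeAlapLevels(const std::vector<std::vector<int>> &succs) {
  const int n = static_cast<int>(succs.size());
  // height[i] = number of edges on the longest path from task i to any sink.
  std::vector<int> height(n, 0);
  for (int i = n - 1; i >= 0; --i) {
    for (int s : succs[i]) {
      assert(s > i && s < n && "edges must follow topological order");
      height[i] = std::max(height[i], height[s] + 1);
    }
  }
  // The critical path length is the largest height in the graph.
  const int critical =
      n == 0 ? 0 : *std::max_element(height.begin(), height.end());
  // ALAP level = latest level at which a task can run without stretching
  // the critical path.
  std::vector<int> alap(n);
  for (int i = 0; i < n; ++i)
    alap[i] = critical - height[i];
  return alap;
}

// Orders tasks for placement: lowest ALAP level first, ties keep index order.
std::vector<int> placementOrder(const std::vector<int> &alap) {
  std::vector<int> order(alap.size());
  std::iota(order.begin(), order.end(), 0);
  std::stable_sort(order.begin(), order.end(),
                   [&](int a, int b) { return alap[a] < alap[b]; });
  return order;
}
```

Tasks on the critical path get the lowest levels and are therefore placed first, which leaves the less constrained tasks to fill the remaining grid positions.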

3. Data Forwarding (SRAM Bypass)

  • Locality-Aware SRAM Assignment: Maps MemRefs to the nearest physical SRAM relative to the task's CGRA position.
  • Direct Wire Configuration: For adjacent fusion candidates, the mapper establishes a direct interconnect path.
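
A minimal sketch of the locality idea follows. The PR description does not state which distance metric is used, so Manhattan distance on the grid is an assumption here, and `GridPos`, `nearestSram`, and `isAdjacent` are hypothetical names that do not come from the pass.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// A position on the 2D grid (hypothetical type for illustration).
struct GridPos {
  int row;
  int col;
};

// Manhattan distance between two grid positions (assumed metric).
int manhattan(GridPos a, GridPos b) {
  return std::abs(a.row - b.row) + std::abs(a.col - b.col);
}

// Returns the index of the SRAM closest to the task's CGRA position.
// Ties are resolved in favor of the lowest index.
int nearestSram(GridPos task, const std::vector<GridPos> &srams) {
  assert(!srams.empty() && "at least one SRAM is required");
  int best = 0;
  for (int i = 1; i < static_cast<int>(srams.size()); ++i)
    if (manhattan(task, srams[i]) < manhattan(task, srams[best]))
      best = i;
  return best;
}

// Two CGRAs are treated as adjacent when they are exactly one hop apart,
// which is the case where a direct wire could replace the SRAM round trip.
bool isAdjacent(GridPos a, GridPos b) { return manhattan(a, b) == 1; }
```

With SRAMs at (0,0), (0,3), and (3,0), a task placed at (1,2) would be assigned the SRAM at (0,3), which is two hops away.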

Tests

  • All tests under test/taskflow are updated with the above features. I am using simplified checks here for a more straightforward view.

…d SARA scoring

This commit implements:
1. MCT dependency analysis for SSA and memory (RAW, WAR, WAW).
2. Critical path prioritization for task placement using ALAP levels.
3. SARA-style scoring heuristic for CGRA placement.
4. Memory mapping with SRAM assignment and direct wire configuration for fusion candidates.
5. Simplified and updated tests for placement verification.

// PLACEMENT: task_name = "Task_0"
// PLACEMENT: cgra_col = 2 : i32, cgra_count = 1 : i32, cgra_row = 1 : i32
// PLACEMENT: task_name = "Task_1"
Collaborator Author

Simplified check here for a better view. Otherwise, it gets super messy.

Collaborator Author

Is it fine?

@guosran guosran marked this pull request as ready for review January 30, 2026 02:09
Copilot AI review requested due to automatic review settings January 30, 2026 02:09
Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a dependency-analysis pass and a spatial placement pass for Minimized Canonicalized Tasks (MCTs) onto a 2D multi-CGRA grid, plus tests that validate the new placement attributes on several kernels.

Changes:

  • Add AnalyzeMCTDependencyPass to detect SSA and memory (RAW/WAR/WAW) dependencies between MCTs and to identify same-header fusion candidates.
  • Add PlaceMCTOnCGRAPass implementing a critical-path–driven placement heuristic with adjacency-aware scoring and basic memory/SRAM assignment, and register both passes in the Taskflow pass pipeline and build system.
  • Extend multi-CGRA Taskflow tests (parallel-nested, multi-nested, irregular-loop) with new RUN lines that invoke the placement pass and FileCheck the resulting taskflow.task placement attributes.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| test/multi-cgra/taskflow/parallel-nested/parallel-nested.mlir | Adds a placement pipeline invocation and PLACEMENT checks for 2 tasks’ CGRA coordinates and counts. |
| test/multi-cgra/taskflow/multi-nested/multi-nested.mlir | Adds placement pipeline and PLACEMENT checks for 5 tasks’ CGRA coordinates and counts in a more complex nested-loop scenario. |
| test/multi-cgra/taskflow/irregular-loop/irregular-loop.mlir | Extends the irregular-loop test with a placement pipeline and PLACEMENT checks for 3 tasks’ CGRA coordinates and counts. |
| lib/TaskflowDialect/Transforms/PlaceMCTOnCGRAPass.cpp | Implements the CGRA placer, including counter-chain extraction, dependency graph construction, ALAP-based task prioritization, heuristic scoring, placement annotation, and memory/SRAM mapping. |
| lib/TaskflowDialect/Transforms/CMakeLists.txt | Registers the new analysis and placement passes with the Taskflow transforms library build. |
| lib/TaskflowDialect/Transforms/AnalyzeMCTDependencyPass.cpp | Implements MCT dependency analysis, printing detailed per-task counter-chain and read/write sets plus a dependency summary. |
| include/TaskflowDialect/TaskflowPasses.td | Declares the analyze-mct-dependency and place-mct-on-cgra passes with documentation used by MLIR’s pass infrastructure. |
| include/TaskflowDialect/TaskflowPasses.h | Declares factory functions for constructing the new Taskflow passes. |

1. Added explicit C++ standard library headers (<algorithm>, <climits>, <cmath>, <string>, <vector>) to avoid transitive include dependencies.
2. Added error handling for grid over-subscription case in findBestPlacement(): when no available CGRA position is found, emits a warning and falls back to position (0,0).
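
The fallback described in item 2 can be sketched roughly as follows. This is not the body of findBestPlacement(): the cost function (summed Manhattan distance to already placed neighbors) is a stand-in for the real scoring heuristic, `std::cerr` stands in for the pass's warning mechanism, and `Pos`, `Placement`, and `findBestPlacementSketch` are hypothetical names.

```cpp
#include <cassert>
#include <climits>
#include <cstdlib>
#include <iostream>
#include <vector>

struct Pos {
  int row;
  int col;
};

struct Placement {
  Pos pos;
  bool fellBack; // true when the grid was full and (0,0) was used.
};

// Picks the free grid cell with the lowest cost, where cost is the summed
// Manhattan distance to the positions of already placed neighbor tasks.
// If every cell is occupied, warns and falls back to position (0,0).
Placement findBestPlacementSketch(
    const std::vector<std::vector<bool>> &occupied,
    const std::vector<Pos> &neighbors) {
  int bestCost = INT_MAX;
  Pos best{0, 0};
  bool found = false;
  for (int r = 0; r < static_cast<int>(occupied.size()); ++r) {
    for (int c = 0; c < static_cast<int>(occupied[r].size()); ++c) {
      if (occupied[r][c])
        continue; // this CGRA is already taken.
      int cost = 0;
      for (const Pos &n : neighbors)
        cost += std::abs(r - n.row) + std::abs(c - n.col);
      if (!found || cost < bestCost) {
        bestCost = cost;
        best = Pos{r, c};
        found = true;
      }
    }
  }
  if (!found) {
    std::cerr << "warning: no free CGRA position, falling back to (0,0)\n";
    return Placement{Pos{0, 0}, true};
  }
  assert(!occupied[best.row][best.col] && "chosen cell must be free");
  return Placement{best, false};
}
```

Returning an explicit `fellBack` flag is a design choice of this sketch; it lets a caller distinguish a genuine placement at (0,0) from the over-subscription fallback.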
@guosran guosran marked this pull request as draft January 30, 2026 02:22
1. Changed output from llvm::outs() to llvm::errs() to avoid stdout/IR conflicts.
2. Simplified Value printing to avoid IR ownership issues during output.
3. Added dependency-analysis.mlir test to verify SSA dependency detection.

Addresses Copilot review comment about missing tests for analyze-mct-dependency pass.
// DEPENDENCY: === MCT Dependency Analysis ===
// DEPENDENCY: Found 2 MCTs
// DEPENDENCY: MCT 0: Task_0
// DEPENDENCY: MCT 1: Task_1
Collaborator Author

Added according to @copilot's suggestion, but I doubt whether it is necessary...

Collaborator Author

Okay, even Copilot itself thinks it should be removed.

Collaborator

It seems this pass does not make any "real" modifications to the IR. We can add these optimizations later; let's just focus on mapping in this PR.

Copilot AI commented Jan 30, 2026

@guosran I've opened a new pull request, #255, to work on those changes. Once the pull request is ready, I'll request review from you.

@guosran guosran marked this pull request as ready for review January 30, 2026 02:51
@guosran guosran requested a review from ShangkunLi January 30, 2026 02:51
// This pass identifies:
// 1. SSA dependencies: Task output → Task input (data flow).
// 2. Memory dependencies: RAW, WAR, WAW via shared memrefs.
// 3. Same-header pairs: Fusion candidates for data forwarding.
Collaborator

The same-header optimization should not be nested in a dependency analysis pass. Please mimic the neura dialect and create an Optimizations folder, such as lib/TaskflowDialect/Transforms/Optimizations, for these optimizations.

//===----------------------------------------------------------------------===//
/// Represents the counter chain (loop header bounds) of an MCT.
struct CounterChainInfo {
SmallVector<int64_t> bounds; // e.g., {4, 8, 6} for 0→4, 0→8, 0→6.
Collaborator

This may not be sufficient for arbitrary static affine loops, because general loops have {lower_bound, upper_bound, step}.
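
One possible shape for a counter chain that carries all three fields is sketched below. This is only an illustration of the reviewer's point, not a proposed final design: `LoopBound` and `CounterChainSketch` are hypothetical names, upper bounds are assumed to be exclusive, and steps are assumed to be positive.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One loop level described by {lower_bound, upper_bound, step}.
// The upper bound is assumed exclusive and the step assumed positive.
struct LoopBound {
  int64_t lower;
  int64_t upper;
  int64_t step;

  // Number of iterations: ceil((upper - lower) / step), or 0 if empty.
  int64_t tripCount() const {
    assert(step > 0 && "only positive steps are handled in this sketch");
    if (upper <= lower)
      return 0;
    return (upper - lower + step - 1) / step;
  }

  bool operator==(const LoopBound &other) const {
    return lower == other.lower && upper == other.upper && step == other.step;
  }
};

// A counter chain as a list of loop levels, outermost first.
struct CounterChainSketch {
  std::vector<LoopBound> loops;

  // Two chains share the same header only if every level matches exactly.
  bool operator==(const CounterChainSketch &other) const {
    return loops == other.loops;
  }
};
```

With this layout, the loops {0, 4, 1} and {4, 8, 1} both run four iterations but compare unequal, a distinction that a representation storing only an iteration count per level cannot express.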

Comment on lines +67 to +68
SetVector<Value> source_memref_reads; // Source memrefs (function args or task outputs).
SetVector<Value> source_memref_writes; // Source memrefs that are written.
Collaborator

I think you misunderstood the memory_inputs and memory_outputs defined in taskflow.task.

The memory inputs indicate which memrefs this task depends on (both read and written). The memory outputs indicate which memrefs are changed by this task.

In other words, the task can be triggered once all the memrefs it depends on, and all its value inputs, are ready.

Likewise, because the task modifies the memrefs in its memory outputs, every other task that needs one of these memrefs must wait until the modification is done.
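
Under this reading, memory dependencies fall directly out of matching one task's memory outputs against later tasks' memory inputs. The sketch below illustrates that idea outside of MLIR. It assumes SSA-style naming, where each modified memref is a fresh value with exactly one producer, and that tasks are listed in program order; `TaskSketch` and `memoryEdges` are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a taskflow.task, with memref values as names.
struct TaskSketch {
  std::string name;
  std::vector<std::string> memoryInputs;  // memrefs the task depends on.
  std::vector<std::string> memoryOutputs; // memref values the task produces.
};

// Returns edges (producer, consumer): the consumer lists among its memory
// inputs a memref value that the producer lists among its memory outputs.
// Tasks are assumed to be in program order.
std::set<std::pair<int, int>>
memoryEdges(const std::vector<TaskSketch> &tasks) {
  std::map<std::string, int> producer; // memref value -> producing task.
  std::set<std::pair<int, int>> edges;
  for (int i = 0; i < static_cast<int>(tasks.size()); ++i) {
    for (const std::string &in : tasks[i].memoryInputs) {
      auto it = producer.find(in);
      if (it != producer.end())
        edges.insert({it->second, i}); // must wait for the producer.
    }
    for (const std::string &out : tasks[i].memoryOutputs) {
      bool inserted = producer.emplace(out, i).second;
      assert(inserted && "each memref value has exactly one producer");
      (void)inserted;
    }
  }
  return edges;
}
```

A memory input that no task produces, such as a function argument, creates no edge, so the task that uses it is ready as soon as its other inputs are.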

Comment on lines +189 to +190
// Checks SSA dependencies: if this task's input is another task's output.
for (Value input : task.getMemoryInputs()) {
Collaborator

Why did you use memory inputs for the SSA dependency check?

The SSA dependency appears in the value inputs/outputs of the task.

let constructor = "taskflow::createCanonicalizeTaskPass()";
}

def AnalyzeMCTDependency : Pass<"analyze-mct-dependency", "func::FuncOp"> {
Collaborator

  1. This dependency analysis pass is suitable not only for MCTs but for all canonical tasks, so it should be called analyze-canonical-task-dependency.
  2. The dependencies are already explicitly identified by the task inputs and outputs, so why do we need this analysis? And this pass does not change any IR, right?
  3. I think we should rename the Minimized Canonical Task to Atomic Canonical Task (ACT) in future optimizations, which is more consistent with its functionality (i.e., serving as the input for fusion).

}
};

//===----------------------------------------------------------------------===//
Collaborator

Why should we consider the SSA value dependency in mapping? Should we construct a task & memref graph for mapping?
