
[BYOC] Enable mcBlas/mcDnn#14

Open
JoieAli wants to merge 4 commits into
MetaX-MACA:v0.18.0from
JoieAli:v0.18.0

Conversation

@JoieAli
Collaborator

@JoieAli JoieAli commented Oct 14, 2025

Change-Id: I1cae7e03deb667afb2564cd4c54c195c45f28038

@gemini-code-assist

Summary of Changes

Hello @JoieAli, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands TVM's capabilities by integrating the mcBLAS and mcDNN libraries, specifically tailored for the MACA backend. The changes span the entire stack, from build system configuration and Python APIs to low-level C++ runtime and code generation. This integration aims to leverage specialized hardware acceleration for linear algebra and deep learning primitives, offering improved performance for models deployed on MACA-compatible devices.

Highlights

  • CMake Configuration Updates: The build system has been updated to enable USE_MCDNN and USE_MCBLAS flags, while USE_FLASHATTN is now explicitly set to OFF. Corresponding library information and conditional compilation logic have been added for these new backends under the USE_MACA umbrella.
  • Core Library Integration: New C++ and Python interfaces have been introduced for mcBLAS (for matrix multiplication) and mcDNN (for deep neural network operations like convolution, softmax, and attention). This includes TOPI compute and schedule definitions, as well as tensor intrinsics for WMMA operations.
  • Relay and Relax Backend Support: Comprehensive backend support has been added for both Relay and Relax, including pattern tables for graph partitioning, lowering functions, and C++ codegen implementations to offload compatible operations to mcBLAS and mcDNN.
  • Testing and Utilities: New Python tests have been added for mcBLAS and mcDNN operations to ensure correctness. Additionally, new testing utility flags (requires_mcblas, requires_mcdnn) and updates to mxcc.py for compute capability checks enhance the testing framework.
  • Runtime Enhancements: C++ runtime components for mcBLAS and mcDNN have been implemented, including JSON runtimes, utility functions for error handling and data type conversions, and integration with the MACA stream management.
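Based on the flags named in the highlights above, the corresponding config.cmake settings might look like the following sketch. This is illustrative only; the exact option names and defaults should be verified against the PR's actual CMake changes.

```cmake
# Illustrative config.cmake fragment (flag names taken from the PR highlights;
# verify against the PR's actual CMake changes before use)
set(USE_MACA ON)        # umbrella flag for the MACA backend
set(USE_MCBLAS ON)      # offload matrix multiplication to mcBLAS
set(USE_MCDNN ON)       # offload conv/softmax/attention to mcDNN
set(USE_FLASHATTN OFF)  # explicitly disabled in this PR
```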
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist (bot) left a comment


Code Review

This pull request enables mcBlas and mcDnn as BYOC backends for the MACA target. The changes are extensive, touching the CMake build system, TOPI operator library, Relay and Relax BYOC infrastructure, and runtime implementations. The overall approach is sound and follows existing patterns in TVM. However, I've found a few critical issues that need to be addressed, such as undefined variables in Python scripts that will lead to runtime errors. There are also some minor inconsistencies, as well as what appear to be accidentally included files for a 'mctlass' backend. Please see my detailed comments for suggestions.

Comment thread python/tvm/contrib/mcdnn.py
Comment thread python/tvm/relay/op/strategy/maca.py
Comment on lines +113 to +123
    return mcdnn.conv_forward(
        data,
        kernel,
        [pad_d, pad_h, pad_w],
        [stride_d, stride_h, stride_w],
        [dilation_d, dilation_h, dilation_w],
        conv_mode=1,
        tensor_format=tensor_format,
        algo=cfg["algo"].val,
        conv_dtype=dtype,
    )


critical

The variable dtype is used on line 122 but it is not defined within the conv3d_mcdnn function. This will cause a NameError. It seems you intended to use out_dtype for the conv_dtype parameter.

Suggested change

    return mcdnn.conv_forward(
        data,
        kernel,
        [pad_d, pad_h, pad_w],
        [stride_d, stride_h, stride_w],
        [dilation_d, dilation_h, dilation_w],
        conv_mode=1,
        tensor_format=tensor_format,
        algo=cfg["algo"].val,
        conv_dtype=dtype,
    )

    return mcdnn.conv_forward(
        data,
        kernel,
        [pad_d, pad_h, pad_w],
        [stride_d, stride_h, stride_w],
        [dilation_d, dilation_h, dilation_w],
        conv_mode=1,
        tensor_format=tensor_format,
        algo=cfg["algo"].val,
        conv_dtype=out_dtype,
    )

Comment on lines +23 to +53
    """Create an extern op that compute matrix mult of A and rhs with mcBLAS

    Parameters
    ----------
    lhs : Tensor
        The left matrix operand
    rhs : Tensor
        The right matrix operand
    transa : bool
        Whether transpose lhs
    transb : bool
        Whether transpose rhs

    Returns
    -------
    C : Tensor
        The result tensor.
    """
    if n == 0:
        n = lhs.shape[1] if transa else lhs.shape[0]
    if m == 0:
        m = rhs.shape[0] if transb else rhs.shape[1]
    dtype = dtype if dtype is not None else lhs.dtype
    return te.extern(
        (n, m),
        [lhs, rhs],
        lambda ins, outs: tvm.tir.call_packed(
            "tvm.contrib.mcblaslt.matmul", ins[0], ins[1], outs[0], transa, transb
        ),
        dtype=dtype,
        name="C",


medium

There are a couple of minor issues here for better clarity and maintainability:

  1. The docstring on line 23 says "mcBLAS" but this module is for "mcBLASlt". It should be updated to avoid confusion.
  2. The name for the te.extern call on line 53 is "C", which is very generic. It would be better to use a more descriptive name like "matmul_mcblaslt" to avoid potential name clashes and improve readability.
    """Create an extern op that compute matrix mult of A and rhs with mcBLASlt

    Parameters
    ----------
    lhs : Tensor
        The left matrix operand
    rhs : Tensor
        The right matrix operand
    transa : bool
        Whether transpose lhs
    transb : bool
        Whether transpose rhs

    Returns
    -------
    C : Tensor
        The result tensor.
    """
    if n == 0:
        n = lhs.shape[1] if transa else lhs.shape[0]
    if m == 0:
        m = rhs.shape[0] if transb else rhs.shape[1]
    dtype = dtype if dtype is not None else lhs.dtype
    return te.extern(
        (n, m),
        [lhs, rhs],
        lambda ins, outs: tvm.tir.call_packed(
            "tvm.contrib.mcblaslt.matmul", ins[0], ins[1], outs[0], transa, transb
        ),
        dtype=dtype,
        name="matmul_mcblaslt",
    )
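The n/m shape inference in the wrapper above can be checked independently of TVM. This small sketch (a hypothetical helper mirroring the logic in the excerpt, not part of the PR) computes the output shape of the extern matmul from the operand shapes and transpose flags:

```python
def matmul_output_shape(lhs_shape, rhs_shape, transa=False, transb=False, n=0, m=0):
    # Mirrors the wrapper above: when n/m are not supplied (0), infer them
    # from the operand shapes, honoring the transpose flags.
    if n == 0:
        n = lhs_shape[1] if transa else lhs_shape[0]
    if m == 0:
        m = rhs_shape[0] if transb else rhs_shape[1]
    return (n, m)
```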

Comment on lines +1 to +69
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied. See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */

/*!
 * \file src/relay/backend/contrib/mctlass/codegen.h
 * \brief The 'custom' compilation pass for MCTLASS (invoked by the RelayToTIRTargetHook pass).
 */

#ifndef TVM_RELAY_BACKEND_CONTRIB_MCTLASS_CODEGEN_H_
#define TVM_RELAY_BACKEND_CONTRIB_MCTLASS_CODEGEN_H_

#include <tvm/ir/transform.h>

#include <string>
#include <vector>

#include "../codegen_c/codegen_c.h"

namespace tvm {
namespace relay {
namespace contrib {
namespace mctlass {

/*!
 * \brief Returns the pass which replaces all calls to "Primitive" functions with "Compiler"
 * attribute of "mctlass" with an call to an extern, and binds a \p runtime::StaticLibrary
 * to the IRModule's "external_mods" attribute containing compiled implementations of
 * those functions using the MCTLASS C++ template library.
 */
transform::Pass CompileForCutlass();

// The rest is sparsely documented since they are exposed only for code sharing between Relay
// and Relax backend implementations.

/*! \brief Emit the function signature for a kernel */
std::string EmitSignature(const std::vector<relay::contrib::Output>& out,
                          const std::string& func_id, const std::vector<std::string>& arg_names);

/*! \brief Generate the body of the kernel */
GenerateBodyOutput GenerateBody(const std::string& func_name, const std::string& ext_func_id,
                                const std::vector<std::string>& output_types,
                                const Array<String>& func_args, const Map<String, ObjectRef>& attrs,
                                int* buf_idx);

/*! \brief Create a C-source module from the given kernel string */
runtime::Module Finalize(const std::string& code, const Array<String>& func_names);

}  // namespace mctlass
}  // namespace contrib
}  // namespace relay
}  // namespace tvm

#endif  // TVM_RELAY_BACKEND_CONTRIB_MCTLASS_CODEGEN_H_


medium

This file, along with other files under src/relay/backend/contrib/mctlass/, seems to have been accidentally included in this pull request. The PR is about mcblas and mcdnn, but these files are for a mctlass backend which is not wired into the build system in this PR. Please consider removing them to keep the PR focused and avoid adding unused code.

JoieAli and others added 3 commits October 14, 2025 18:23
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>