RegularNetwork.Launch works for non-dense UniTensors with cutensor#750
manuschneider wants to merge 2 commits into master
Conversation
… when cutensor is enabled; fixed some comments and typos
Codecov Report
❌ Patch coverage is
Additional details and impacted files:
@@ Coverage Diff @@
##            master     #750      +/-  ##
==========================================
+ Coverage    35.25%   35.34%   +0.08%
==========================================
  Files          215      215
  Lines        33095    33009      -86
  Branches     13196    13135      -61
==========================================
- Hits         11669    11668       -1
+ Misses       19512    19428      -84
+ Partials      1914     1913       -1
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d0fa8d45db
src/RegularNetwork.cpp (outdated)

    this->tensors[idx].relabels(this->label_arr[idx]);  // this conflict
    this->CtTree.base_nodes[idx]->is_assigned = true;
  #if defined(UNI_GPU) && defined(UNI_CUQUANTUM)  // gpu workflow with cuquantum
    if (tn_device == Device.cuda && this->tensors[0].uten_type() == UTenType.Dense) {
Use cuquantum branch for all CUDA device IDs
The new guard tn_device == Device.cuda only matches GPU 0, because Device.cuda is defined as 0 while this codebase supports Device.cuda+<gpuid> for other GPUs. As a result, dense tensors on GPU 1+ no longer enter the cuquantum path and instead fall back to the generic contraction tree, which also ignores the cuquantum einsum_path produced by setOrder(optimal=true). This is a regression for multi-GPU users and can cause major performance/memory blowups on larger networks.
… in RegularNetwork.cpp
Code Review: PR #750 "RegularNetwork.Launch works for non-dense UniTensors with cuTensor"

Critical Issues

1. The PR title/goal is NOT achieved: non-dense UniTensors still fail at runtime. Inside Launch, the pre-existing guard still throws:

    if (this->tensors[0].uten_type() != UTenType.Dense) {
      cytnx_error_msg(true, "[ERROR][Launch]...",
                      "Sparse or Block type UniTensor network contraction is not support.\n");
      return UniTensor();  // also unreachable dead code
    }

Non-dense UniTensors on any GPU still hit this and throw. The device-routing fix is correct on its own (a multi-GPU improvement), but the PR's stated goal of resolving issue #592 is not accomplished.

2. The cuTensorNet layer fundamentally only supports dense tensors.

Important Issues

3. Device comparison uses a magic literal, despite what the PR description claims.

4. Typo fixes listed in the PR description were not applied.
5.
Positive Aspects
Summary
Bottom line: The PR makes a valid multi-GPU routing fix but should not be merged under its current title/description — it does not resolve issue #592. The description should be corrected to reflect the actual fix scope, and the
This should fix #592.