feat(dag): fallback to CPU transport for TorchTensorType(transport='n… by caosfourn · Pull Request #64239 · ray-project/ray

caosfourn · 2026-06-21T05:27:13Z

Description

This PR implements a fallback mechanism to CPU/Shared Memory transport for TorchTensorType(transport="nccl") when executed outside of Compiled Graphs (i.e. in traditional non-compiled DAGs).

Currently, specifying the "nccl" or "accelerator" transport outside of compiled graphs leads to an AssertionError (or crashes) because the communicator group (communicator_id) and communicator context have not been initialized by the Compiled Graph compiler.

To support debugging and rapid prototyping in non-compiled mode, this PR intercepts cases where no communicator has been set up inside TorchTensorType.create_channel(), emits a UserWarning, and automatically falls back to SharedMemoryType().create_channel().
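The control flow described above can be illustrated with a minimal, Ray-free sketch. The classes below are hypothetical stand-ins, not Ray's actual `TorchTensorType` or `SharedMemoryType`; only the attribute names `_communicator_id` and `_communicator`, and the method names `requires_accelerator()` and `create_channel()`, are taken from this description. The tuple return values and the warning text are assumptions made for illustration.

```python
import warnings


class SharedMemoryChannelType:
    """Hypothetical stand-in for a host-memory channel type (not Ray's class)."""

    def create_channel(self, writer, readers):
        return ("shared_memory", writer, tuple(readers))


class AcceleratorTensorType:
    """Hypothetical stand-in for a tensor type hint with a transport option."""

    def __init__(self, transport="auto"):
        self.transport = transport
        # Per the PR description, the Compiled Graph compiler sets these.
        # Outside a compiled graph they are assumed to stay None.
        self._communicator_id = None
        self._communicator = None

    def requires_accelerator(self):
        return self.transport in ("nccl", "accelerator")

    def create_channel(self, writer, readers):
        no_communicator = (
            self._communicator_id is None and self._communicator is None
        )
        if self.requires_accelerator() and no_communicator:
            # Fallback path: warn about the performance trade-off, then
            # delegate to the host-memory channel type.
            warnings.warn(
                f"transport={self.transport!r} was requested but no "
                "communicator is set up; falling back to shared-memory "
                "transport, which moves tensors through host memory.",
                UserWarning,
                stacklevel=2,
            )
            return SharedMemoryChannelType().create_channel(writer, readers)
        if self.requires_accelerator():
            # Placeholder for the normal accelerator channel.
            return ("accelerator", writer, tuple(readers))
        return SharedMemoryChannelType().create_channel(writer, readers)
```

Under these assumptions, `AcceleratorTensorType(transport="nccl").create_channel("w", ["r"])` warns and returns a shared-memory channel, while an instance whose `_communicator_id` has been assigned takes the accelerator branch silently.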

Related issues

Related to #43328

Additional information

Implementation Details:

python/ray/experimental/channel/torch_tensor_type.py:
- Checks if self._communicator_id and self._communicator are both None when self.requires_accelerator() is true.
- Emits a warning informing the user about the performance trade-off.
- Redirects to host-memory SharedMemoryType channel creation.
python/ray/dag/tests/experimental/test_non_compiled_nccl_dag.py:
- Added a new unit test validating that non-compiled graphs with TorchTensorType(transport="nccl") emit a UserWarning and fall back to a functional CPU execution path without crashing.
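The shape of such a test can be sketched with the standard library alone. This is not the content of test_non_compiled_nccl_dag.py: `create_channel` below is a hypothetical stand-in for the code under test, and its string return values and the `"group-0"` identifier are assumptions. The sketch only shows one way to assert both that the warning is emitted on the fallback path and that the path with a communicator stays silent.

```python
import warnings


def create_channel(transport, communicator_id=None):
    """Hypothetical stand-in for the channel factory under test."""
    if transport == "nccl" and communicator_id is None:
        warnings.warn("no communicator; using shared memory", UserWarning)
        return "shared_memory"
    return "accelerator" if transport == "nccl" else "shared_memory"


def test_fallback_warns_and_uses_shared_memory():
    with warnings.catch_warnings(record=True) as caught:
        # "always" prevents the warning from being suppressed as a repeat.
        warnings.simplefilter("always")
        channel = create_channel("nccl")
    assert channel == "shared_memory"
    assert [w.category for w in caught] == [UserWarning]


def test_path_with_communicator_is_silent():
    with warnings.catch_warnings():
        # "error" turns any warning into an exception, so the call
        # must complete without warning for this test to pass.
        warnings.simplefilter("error")
        assert create_channel("nccl", communicator_id="group-0") == "accelerator"
```

With pytest, the first check can be written more compactly using the `pytest.warns(UserWarning)` context manager.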

Testing:

Verified that traditional Compiled Graphs (where the compiler assigns a communicator_id) are unaffected.
Tested successfully using unit tests and mock integration suites.

gemini-code-assist

Code Review

This pull request implements a fallback mechanism to CPU/shared-memory transport with a warning when TorchTensorType requires an accelerator but is used outside of a Compiled Graph (i.e., _communicator_id is None). It also adds corresponding unit tests. A review comment points out that the warning message incorrectly refers to transport='nccl' instead of transport='accelerator', which is the correct option.



caosfourn requested a review from a team as a code owner June 21, 2026 05:27

gemini-code-assist Bot reviewed Jun 21, 2026


Comment thread python/ray/experimental/channel/torch_tensor_type.py

caosfourn force-pushed the nccl-noncompiled-fallback branch 2 times, most recently from 79a793e to 1a7ba86 Compare June 21, 2026 06:03

ray-gardener Bot added the core and community-contribution labels Jun 21, 2026

feat(dag): fallback to CPU transport for TorchTensorType(transport='n…

5c3bf9a

…ccl') in non-compiled graphs Signed-off-by: CaosFourN <huynhdnhannd@gmail.com>

caosfourn force-pushed the nccl-noncompiled-fallback branch from 1a7ba86 to 5c3bf9a Compare June 22, 2026 08:51

caosfourn wants to merge 1 commit into ray-project:master from caosfourn:nccl-noncompiled-fallback