Skip to content

✨ Add DLPack tensor export functionality to AIGs#308

Open
marcelwa wants to merge 32 commits into
mainfrom
dlpack-tensors
Open

✨ Add DLPack tensor export functionality to AIGs#308
marcelwa wants to merge 32 commits into
mainfrom
dlpack-tensors

Conversation

@marcelwa

@marcelwa marcelwa commented Mar 11, 2026

Copy link
Copy Markdown
Owner

Description

This pull request introduces a new feature for exporting AIG (And-Inverter Graph) networks as DLPack-compatible sparse tensors, enabling efficient, zero-copy integration with modern machine learning frameworks such as PyTorch, JAX, and TensorFlow. It adds new encoding options for node and edge attributes, updates documentation and Python/C++ interfaces, and expands optional dependencies to include PyTorch for adapters.

Major features and changes:

DLPack Tensor Export for Machine Learning

  • Adds a new to_graph_tensors method to the Aig class (Python and C++), allowing the export of graph topology and features as DLPack-compatible NumPy arrays, suitable for direct consumption by ML frameworks. The export supports configurable node and edge encodings, and optional inclusion of logic level, fanout, and truth-table features. [1] [2] [3]

Node and Edge Encoding Options

  • Introduces NodeTensorEncoding and EdgeTensorEncoding enums in both Python and C++ APIs, supporting integer and one-hot encodings for nodes, and binary, signed, or one-hot encodings for edges. These are exposed to Python and documented in both code and user documentation. [1] [2] Ff096675L113R113, [3]

Documentation and Examples

  • Expands the user documentation with a new section on DLPack sparse tensor export, including practical end-to-end examples for PyTorch and NumPy, encoding/dtype mapping, and expected tensor shapes.

Dependency Updates

  • Adds PyTorch (torch>=2.2.0) as an optional dependency for the adapters group in pyproject.toml, to support DLPack-based workflows. [1] [2]

Internal C++ Implementation

  • Implements the tensor export logic in a new C++ header (graph_tensors.hpp), handling the construction of edge and node feature arrays, DLPack-compatibility, and memory management for safe hand-off to Python and external consumers. [1] [2]

These changes together enable seamless, high-performance integration of AIG network structures with modern ML pipelines and frameworks.

Resolves #307

Checklist:

  • The pull request only contains commits that are related to it.
  • I have added appropriate tests and documentation.
  • I have made sure that all CI jobs on GitHub pass.
  • The pull request introduces no new warnings and follows the project's style guidelines.

Summary by CodeRabbit

  • New Features

    • Export AIG networks as graph tensors (edge_index, edge_attr, node_attr) with selectable node/edge encodings, optional level/fanout, and optional truth-table export; Python enums and to_graph_tensors method added. Sequential networks raise on unsupported export.
  • Documentation

    • New DLPack Tensors guide with end-to-end, zero-copy examples for PyTorch/JAX/TF/NumPy, dtype/shape mappings, usage notes, and performance guidance.
  • Tests

    • Comprehensive tests for encodings, shapes/dtypes, truth-table handling, DLPack interop, large graphs, and edge cases.

@marcelwa marcelwa self-assigned this Mar 11, 2026
@marcelwa marcelwa added enhancement New feature or request Python Relating the Python part of the project performance Performance related changes C++ Relating the C++ part of the project machine learning Anything related to ML integration labels Mar 11, 2026
@coderabbitai

coderabbitai Bot commented Mar 11, 2026

Copy link
Copy Markdown

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds DLPack-backed graph-tensor export: C++ exporter producing contiguous NumPy-backed ndarrays (edge_index, edge_attr, node_attr), new node/edge encoding enums, nanobind Python bindings (to_graph_tensors), type-stub updates, tests, and docs with zero-copy examples for PyTorch/NumPy/JAX.

Changes

Cohort / File(s) Summary
Documentation
docs/machine_learning.md
Replaced Truth Tables section with DLPack graph-tensor guidance, added dtype/shape mappings, usage notes/limitations, end-to-end zero-copy examples and ML pipeline recommendations.
Configuration
pyproject.toml
Added torch>=2.2.0 to the adapters optional-dependencies and dependency-groups.
Type Stubs
python/aigverse/networks.pyi
Added NodeTensorEncoding and EdgeTensorEncoding enums and Aig.to_graph_tensors(...) / SequentialAig.to_graph_tensors(...) stubs with encoding options and include flags.
Core C++ Implementation
src/aigverse/networks/graph_tensors.hpp
New header implementing to_graph_tensors and make_owned_ndarray: builds edge_index (2×E), edge_attr (E×D_edge), node_attr (N×D_node) with encoding modes, optional level/fanout/truth-table (TT up to 16 PIs), and returns NumPy-backed ndarrays owned via capsule.
Python Bindings
src/aigverse/networks/logic_networks.cpp
Exposes NodeTensorEncoding/EdgeTensorEncoding enums and binds to_graph_tensors on network types (defaults: INTEGER, BINARY, include_level=true); includes docstring and SequentialAig error path.
Tests
test/networks/test_graph_tensors.py, test/networks/test_sequential_aig.py
Comprehensive tests for enum exposure, tensor shapes/dtypes across encodings/options, DLPack interoperability (Torch/NumPy aliasing), truth-table expansion, large/empty AIG cases, and SequentialAig error behavior.

Sequence Diagram(s)

sequenceDiagram
    participant User as Python User
    participant AIG as AIG Network
    participant Export as C++ Export Engine
    participant DLPack as NumPy ndarray / DLPack
    participant Framework as PyTorch / JAX Consumer

    User->>AIG: to_graph_tensors(node_encoding, edge_encoding, options)
    AIG->>Export: request topology & attributes
    Export->>Export: build edge_index, edge_attr, node_attr (encodings, level, fanout, TT)
    Export->>DLPack: create owned ndarrays (DLPack-capable)
    Export->>User: return dict of ndarrays / DLPack capsules
    User->>Framework: torch.from_dlpack(...) / jax.dlpack.from_dlpack(...)
    Framework-->>User: usable tensors (zero-copy)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • PR #285: Overlaps the same network binding/stub area; earlier changes to network binding surface and stubs.
  • PR #298: Related edits to Aig/SequentialAig Python bindings and stubs; touches same API surface.
  • PR #297: Migration/refactor of binding layer to nanobind; directly relevant to binding additions here.

Suggested labels

documentation

Poem

🐰 I hop through bits and tensors bright,
I stitch the graphs for zero-copy flight,
Capsules snug and contiguous rows,
From C++ burrow the PyTorch wind blows,
I twitch my whiskers—data swiftly goes. 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: adding DLPack tensor export functionality to AIGs, which is the primary feature across all modified files.
Description check ✅ Passed The PR description comprehensively covers major features, changes, documentation updates, and dependency additions with clear links to specific changes. However, the checklist items are not completed (all marked unchecked).
Linked Issues check ✅ Passed The PR fully implements all objectives from issue #307: DLPack tensor export with configurable node/edge encodings, zero-copy data transfer, shape specifications (edge_index: 2×E, edge_attr: E×D_edge, node_attr: N×D_node), optional features (level, fanout, truth table), and comprehensive documentation with examples.
Out of Scope Changes check ✅ Passed All changes are directly scoped to implementing DLPack tensor export: documentation updates, Python/C++ API additions, test coverage, and torch dependency addition. No unrelated modifications detected.
Docstring Coverage ✅ Passed Docstring coverage is 80.77% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch dlpack-tensors

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Comment thread src/aigverse/networks/graph_tensors.hpp Fixed

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
pyproject.toml (1)

44-44: Split PyTorch into a dedicated extra.

python/aigverse/adapters/__init__.py only gates to_networkx() on NetworkX/NumPy; torch is only imported in tests and docs, not in the adapter runtime itself. Including torch>=2.2.0 in aigverse[adapters] makes the published NetworkX extra unnecessarily heavy. A separate ml or torch extra would align end-user installs with actual runtime dependencies.

Example split
 [project.optional-dependencies]
-adapters = ["networkx>=3.0.0", "numpy>=1.23.0", "torch>=2.2.0"]
+adapters = ["networkx>=3.0.0", "numpy>=1.23.0"]
+ml = ["torch>=2.2.0"]

 [dependency-groups]
 adapters = [
     "networkx>=3.0.0",
     "numpy>=1.23.0",
-    "torch>=2.2.0",
 ]
+ml = [
+    "torch>=2.2.0",
+]
 docs = [
     { include-group = "adapters" },
+    { include-group = "ml" },
     ...
 ]
 test = [
     { include-group = "adapters" },
+    { include-group = "ml" },
     ...
 ]

Also applies to: 207-210

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyproject.toml` at line 44, Remove torch from the "adapters" extra and create
a dedicated extra (e.g., "ml" or "torch") in pyproject.toml; specifically,
update the adapters extra to only include networkx and numpy and add a new extra
entry (ml/torch) that lists "torch>=2.2.0", and ensure any references to
python/aigverse/adapters/__init__.py and the to_networkx() gating remain
unchanged (since torch is only used in tests/docs). Also apply the same split to
the other extras block indicated (lines ~207-210) so the torch requirement is
removed from adapters there and placed into the new extra.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/aigverse/networks/logic_networks.cpp`:
- Around line 389-425: The binding for to_graph_tensors currently accepts a
generic Ntk and thus silently upcasts SequentialAig to Aig, dropping
sequential/register info; mirror the pattern used for to_index_list /
__getstate__ / __setstate__ by adding a Sequential-aware overload or a guard:
add an overload/bind for SequentialNtk/SequentialAig (or a pre-check in the
lambda that tests ntk.is_sequential / dynamic_cast to SequentialNtk) that raises
a TypeError with the same message used for to_index_list, or else explicitly
disable/override to_graph_tensors for SequentialAig to prevent silent loss of
state; update the binding for to_graph_tensors to reference the new overload so
calls on SequentialAig trigger the error.

---

Nitpick comments:
In `@pyproject.toml`:
- Line 44: Remove torch from the "adapters" extra and create a dedicated extra
(e.g., "ml" or "torch") in pyproject.toml; specifically, update the adapters
extra to only include networkx and numpy and add a new extra entry (ml/torch)
that lists "torch>=2.2.0", and ensure any references to
python/aigverse/adapters/__init__.py and the to_networkx() gating remain
unchanged (since torch is only used in tests/docs). Also apply the same split to
the other extras block indicated (lines ~207-210) so the torch requirement is
removed from adapters there and placed into the new extra.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 640a37be-a713-4018-abeb-9f018cbae7e3

📥 Commits

Reviewing files that changed from the base of the PR and between cfab5c5 and b8e82e5.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • docs/machine_learning.md
  • pyproject.toml
  • python/aigverse/networks.pyi
  • src/aigverse/networks/graph_tensors.hpp
  • src/aigverse/networks/logic_networks.cpp
  • test/networks/test_graph_tensors.py

Comment thread src/aigverse/networks/logic_networks.cpp

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/aigverse/networks/graph_tensors.hpp`:
- Line 197: The code currently computes a single po_level using
depth_ntk->depth() + 1 and reuses it for all POs (po_level, include_level),
which makes every PO appear at the same depth; instead compute each PO's level
from its driving node: for each PO iterate its driving node (e.g. the driver
returned when handling PO entries) and call its depth (or
depth_ntk->depth_of(driver) / driver->depth()) to compute per-PO po_level, then
use that per-PO value when populating node_attr and related logic (replace the
shared static_cast<float>(depth_ntk->depth() + 1) with a per-PO computation
based on the PO's driver). Ensure this change is applied in the block that
builds node_attr and in the nearby range around lines where po_level is used
(the logic that processes POs at 246-257).
- Around line 168-190: The code risks huge allocations when
include_truth_table==true because tt_dim = 2^ntk.num_pis() can explode and
node_attr is then allocated as node_count * node_dim; before computing/using
tt_dim and allocating node_attr, add a guard that computes tt_dim from
ntk.num_pis(), checks for overflow and caps (e.g., refuse truth-table export if
num_pis() exceeds a safe limit or if tt_dim * node_count would exceed a
configured max_bytes), and fail-fast or disable include_truth_table with a clear
error; perform safe size_t maths (detect multiplication overflow for tt_dim *
node_count and node_count * node_dim) and reference the symbols tt_dim,
include_truth_table, node_dim, node_count, node_attr and ntk.num_pis() when
adding the checks.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 86445985-5cf7-4381-b68d-e56e99b6d8cc

📥 Commits

Reviewing files that changed from the base of the PR and between b8e82e5 and d12ad38.

📒 Files selected for processing (1)
  • src/aigverse/networks/graph_tensors.hpp

Comment thread src/aigverse/networks/graph_tensors.hpp Outdated
Comment thread src/aigverse/networks/graph_tensors.hpp Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
src/aigverse/networks/graph_tensors.hpp (2)

202-202: ⚠️ Potential issue | 🟠 Major

Derive each PO level from its driver, not global depth.

Line 202 caches depth() + 1 once and reuses it for every PO row. That makes node_attr wrong whenever outputs have different depths.

🐛 Proposed fix
-    const auto po_level = include_level ? static_cast<float>(depth_ntk->depth() + 1) : 0.0f;
-
     const auto fill_base = [&](const size_t row, const int64_t type_index) -> size_t
     {
@@
             if (include_level)
             {
-                node_attr[feature_offset++] = po_level;
+                node_attr[feature_offset++] =
+                    static_cast<float>(depth_ntk->level(ntk.get_node(po)) + 1);
             }

Also applies to: 251-261

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/aigverse/networks/graph_tensors.hpp` at line 202, The code computes a
single po_level using depth_ntk->depth() + 1 and reuses it for all PO rows,
causing node_attr to be incorrect when outputs have different depths; change the
logic to derive po_level per-PO from its driver node (use the driver's depth or
depth attribute) when include_level is true so each PO row computes its own
per-output level instead of using the global depth_ntk->depth(); update the code
paths that build node_attr (including the block referenced by po_level and the
similar logic around lines 251-261) to compute and use this per-driver po_level
based on the PO's driver node.

173-195: ⚠️ Potential issue | 🟠 Major

Guard include_truth_table against exponential allocations.

tt_dim grows with 2^num_pis(), and node_attr is allocated from node_count * node_dim with no cap or overflow check. Large inputs can turn this path into a huge allocation or wrapped size instead of a controlled error.

🛡️ Suggested guard
+#include <limits>
+
     size_t tt_dim = 0;
@@
     if (include_truth_table)
     {
+        constexpr size_t max_truth_table_pis = 16;
+        if (static_cast<size_t>(ntk.num_pis()) > max_truth_table_pis)
+        {
+            throw std::invalid_argument("truth-table export is only supported up to 16 primary inputs");
+        }
+
         node_tts = mockturtle::simulate_nodes<aigverse::truth_table>(
             ntk, mockturtle::default_simulator<aigverse::truth_table>{static_cast<unsigned>(ntk.num_pis())});
         output_tts = mockturtle::simulate<aigverse::truth_table>(
             ntk, mockturtle::default_simulator<aigverse::truth_table>{static_cast<unsigned>(ntk.num_pis())});
@@
     const size_t node_count = ntk.size() + ntk.num_pos();
+
+    if (node_dim != 0 && node_count > std::numeric_limits<size_t>::max() / node_dim)
+    {
+        throw std::overflow_error("node_attr size overflow");
+    }
 
     std::vector<float> node_attr(node_count * node_dim, 0.0f);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/aigverse/networks/graph_tensors.hpp` around lines 173 - 195, The code
risks exponential or overflow allocations when include_truth_table is true
because tt_dim is 2^ntk.num_pis() and node_attr uses node_count * node_dim;
update the block that computes tt_dim/node_dim/node_attr (symbols:
include_truth_table, tt_dim, node_tts, output_tts, node_dim, node_count,
node_attr, ntk.num_pis()) to first bound ntk.num_pis() to a safe maximum (or
compute tt_dim and bail if it exceeds a configured max_bits or memory
threshold), validate that tt_dim + base dims cannot cause size_t overflow when
computing node_dim, and check that node_count * node_dim stays below a safe
limit (or std::numeric_limits<size_t>::max()/throw an error) before allocating
node_attr; if any check fails, return or throw a clear error instead of
attempting the allocation.
🧹 Nitpick comments (1)
docs/machine_learning.md (1)

133-138: Mention that graph-tensor export is combinational-only.

This intro reads as generally available across network types, but SequentialAig.to_graph_tensors() intentionally raises. A short note here would prevent users from following the example and then hitting a confusing runtime error on stateful nets.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/machine_learning.md` around lines 133 - 138, Clarify that graph-tensor
export is combinational-only by adding a brief note stating that
SequentialAig.to_graph_tensors() intentionally raises for stateful/sequential
networks, so the described DLPack zero-copy export applies only to combinational
AIGs; update the paragraph to warn users to avoid calling to_graph_tensors() on
stateful nets (or catch/handle the raised error) and reference
SequentialAig.to_graph_tensors() by name so readers can find the behavior in the
API.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/aigverse/networks/graph_tensors.hpp`:
- Line 202: The code computes a single po_level using depth_ntk->depth() + 1 and
reuses it for all PO rows, causing node_attr to be incorrect when outputs have
different depths; change the logic to derive po_level per-PO from its driver
node (use the driver's depth or depth attribute) when include_level is true so
each PO row computes its own per-output level instead of using the global
depth_ntk->depth(); update the code paths that build node_attr (including the
block referenced by po_level and the similar logic around lines 251-261) to
compute and use this per-driver po_level based on the PO's driver node.
- Around line 173-195: The code risks exponential or overflow allocations when
include_truth_table is true because tt_dim is 2^ntk.num_pis() and node_attr uses
node_count * node_dim; update the block that computes tt_dim/node_dim/node_attr
(symbols: include_truth_table, tt_dim, node_tts, output_tts, node_dim,
node_count, node_attr, ntk.num_pis()) to first bound ntk.num_pis() to a safe
maximum (or compute tt_dim and bail if it exceeds a configured max_bits or
memory threshold), validate that tt_dim + base dims cannot cause size_t overflow
when computing node_dim, and check that node_count * node_dim stays below a safe
limit (or std::numeric_limits<size_t>::max()/throw an error) before allocating
node_attr; if any check fails, return or throw a clear error instead of
attempting the allocation.

---

Nitpick comments:
In `@docs/machine_learning.md`:
- Around line 133-138: Clarify that graph-tensor export is combinational-only by
adding a brief note stating that SequentialAig.to_graph_tensors() intentionally
raises for stateful/sequential networks, so the described DLPack zero-copy
export applies only to combinational AIGs; update the paragraph to warn users to
avoid calling to_graph_tensors() on stateful nets (or catch/handle the raised
error) and reference SequentialAig.to_graph_tensors() by name so readers can
find the behavior in the API.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6841f6db-6583-4d32-af41-f23099e4a457

📥 Commits

Reviewing files that changed from the base of the PR and between d12ad38 and 34e475c.

📒 Files selected for processing (5)
  • docs/machine_learning.md
  • python/aigverse/networks.pyi
  • src/aigverse/networks/graph_tensors.hpp
  • src/aigverse/networks/logic_networks.cpp
  • test/networks/test_sequential_aig.py

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/aigverse/networks/graph_tensors.hpp (1)

141-202: ⚠️ Potential issue | 🟠 Major

Add overflow checks before allocating the export buffers.

The 16-PI cap bounds tt_dim, but these allocations still multiply graph-derived sizes directly. On large AIGs, 2 * edge_count, edge_count * edge_dim, or node_count * node_dim can wrap before the vector is allocated, and the later writes then run past the buffer.

🛡️ Suggested guard
-    std::vector<int64_t> edge_index(2 * edge_count, 0);
+    if (edge_count > std::numeric_limits<size_t>::max() / 2)
+    {
+        throw std::overflow_error("edge_index size overflow");
+    }
+    std::vector<int64_t> edge_index(2 * edge_count, 0);
@@
-    std::vector<float> edge_attr(edge_count * edge_dim, 0.0f);
+    if (edge_dim != 0 && edge_count > std::numeric_limits<size_t>::max() / edge_dim)
+    {
+        throw std::overflow_error("edge_attr size overflow");
+    }
+    std::vector<float> edge_attr(edge_count * edge_dim, 0.0f);
@@
-    std::vector<float> node_attr(node_count * node_dim, 0.0f);
+    if (node_dim != 0 && node_count > std::numeric_limits<size_t>::max() / node_dim)
+    {
+        throw std::overflow_error("node_attr size overflow");
+    }
+    std::vector<float> node_attr(node_count * node_dim, 0.0f);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/aigverse/networks/graph_tensors.hpp` around lines 141 - 202, Before
allocating edge_index, edge_attr, and node_attr, add overflow/size checks:
verify edge_count <= SIZE_MAX / 2 before computing 2 * edge_count for
edge_index; verify edge_dim > 0 and edge_count <= SIZE_MAX / edge_dim before
computing edge_count * edge_dim for edge_attr; compute node_count and node_dim
and verify node_dim > 0 and node_count <= SIZE_MAX / node_dim before allocating
node_attr; if any check fails, throw std::length_error (or std::bad_alloc) with
a clear message. Ensure you reference and guard the symbols edge_index,
edge_attr, node_attr, edge_dim, node_dim, node_count, and tt_dim when performing
these checks.
🧹 Nitpick comments (1)
docs/machine_learning.md (1)

155-165: Define N the same way the exporter does.

Right now N reads like the network size, but the implementation emits ntk.size() + ntk.num_pos() rows. Please spell out that node_attr includes the constant node and each PO as separate rows; otherwise users will read node_attr.shape[0] as off-by-one/off-by-num_pos.

📝 Suggested wording
-where $E$ is the number of edges and $N$ is the number of nodes. For edge features,
+where $E$ is the number of edges and
+$N = \texttt{aig.size} + \texttt{aig.num_pos}$ is the number of exported rows
+(including the constant node and one row per PO). For edge features,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/machine_learning.md` around lines 155 - 165, Update the documentation's
definition of N to match the exporter: state that N = ntk.size() +
ntk.num_pos(), and explicitly note that node_attr includes one row per constant
node plus one row per primary output (PO), so node_attr.shape[0] equals the
number of network nodes plus num_pos; reference the symbols node_attr,
ntk.size(), and ntk.num_pos() in the text to make the correspondence clear.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/networks/test_graph_tensors.py`:
- Around line 3-5: The module-level "import torch" in test_graph_tensors.py
prevents pytest collection when torch is absent; remove that top-level import
and instead call torch = pytest.importorskip("torch") inside each of the three
torch-dependent tests (the tests that actually use torch) so the remaining 10
NumPy-only tests can be collected and run without torch installed.

---

Duplicate comments:
In `@src/aigverse/networks/graph_tensors.hpp`:
- Around line 141-202: Before allocating edge_index, edge_attr, and node_attr,
add overflow/size checks: verify edge_count <= SIZE_MAX / 2 before computing 2 *
edge_count for edge_index; verify edge_dim > 0 and edge_count <= SIZE_MAX /
edge_dim before computing edge_count * edge_dim for edge_attr; compute
node_count and node_dim and verify node_dim > 0 and node_count <= SIZE_MAX /
node_dim before allocating node_attr; if any check fails, throw
std::length_error (or std::bad_alloc) with a clear message. Ensure you reference
and guard the symbols edge_index, edge_attr, node_attr, edge_dim, node_dim,
node_count, and tt_dim when performing these checks.

---

Nitpick comments:
In `@docs/machine_learning.md`:
- Around line 155-165: Update the documentation's definition of N to match the
exporter: state that N = ntk.size() + ntk.num_pos(), and explicitly note that
node_attr includes one row per constant node plus one row per primary output
(PO), so node_attr.shape[0] equals the number of network nodes plus num_pos;
reference the symbols node_attr, ntk.size(), and ntk.num_pos() in the text to
make the correspondence clear.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4700d12b-dc8b-4b41-9819-25170a7eed91

📥 Commits

Reviewing files that changed from the base of the PR and between 34e475c and e97d2ee.

📒 Files selected for processing (3)
  • docs/machine_learning.md
  • src/aigverse/networks/graph_tensors.hpp
  • test/networks/test_graph_tensors.py

Comment thread test/networks/test_graph_tensors.py

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/aigverse/networks/graph_tensors.hpp (1)

174-201: ⚠️ Potential issue | 🟠 Major

Reinstate a total-allocation guard for truth-table-enhanced node_attr.

Line 198 and Line 201 still allow very large node_count * node_dim allocations with no explicit memory-budget cap. The PI cap helps, but large networks can still hit extreme memory pressure.

💡 Suggested guard
+    const auto checked_add = [](const size_t a, const size_t b, const char* what) -> size_t
+    {
+        if (b > std::numeric_limits<size_t>::max() - a)
+        {
+            throw std::overflow_error(what);
+        }
+        return a + b;
+    };
+
     const size_t base_dim = node_encoding == node_tensor_encoding::ONE_HOT ? node_type_one_hot_dim : 1;
     const size_t node_dim =
         base_dim + (include_level ? 1 : 0) + (include_fanout ? 1 : 0) + (include_truth_table ? tt_dim : 0);
-    const size_t node_count = ntk.size() + ntk.num_pos();
+    const size_t node_count = checked_add(ntk.size(), ntk.num_pos(), "node_count overflow");
+    const size_t node_attr_elems = checked_mul(node_count, node_dim, "node_attr size overflow");
+
+    if (include_truth_table)
+    {
+        constexpr size_t max_node_attr_bytes = size_t{1} << 30; // 1 GiB guardrail
+        if (node_attr_elems > max_node_attr_bytes / sizeof(float))
+        {
+            throw std::invalid_argument("node_attr allocation exceeds configured safety limit");
+        }
+    }
 
-    std::vector<float> node_attr(node_count * node_dim, 0.0f);
+    std::vector<float> node_attr(node_attr_elems, 0.0f);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/aigverse/networks/graph_tensors.hpp` around lines 174 - 201, The
allocation of node_attr using node_count * node_dim can still overflow or
exhaust memory for large networks; before constructing node_attr, compute a safe
total_size = node_count * node_dim (use size_t and check for multiplication
overflow) and enforce a configurable or hard cap (e.g., compare against a
MAX_NODES_ATTR_BYTES constant or runtime budget) and throw std::length_error or
std::bad_alloc if exceeded; place this guard immediately before the
vector<float> node_attr(...) creation and reference node_count, node_dim,
tt_dim, include_truth_table, ntk.size(), and ntk.num_pos() when
computing/validating total_size.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/aigverse/networks/graph_tensors.hpp`:
- Around line 140-150: The vector allocations for edge_index and edge_attr can
overflow when computing sizes (2 * edge_count and edge_count * edge_dim); add
explicit overflow checks before those allocations in the code that constructs
edge_index and edge_attr (check that edge_count <= SIZE_MAX/2 for edge_index and
edge_count <= SIZE_MAX/edge_dim for edge_attr, and validate edge_dim is
non‑zero) and fail fast (throw std::overflow_error or return an error) if the
checks fail; update the block around edge_index, edge_count, edge_dim,
edge_tensor_encoding and edge_attr to perform these guards before allocating or
writing into the vectors.

---

Duplicate comments:
In `@src/aigverse/networks/graph_tensors.hpp`:
- Around line 174-201: The allocation of node_attr using node_count * node_dim
can still overflow or exhaust memory for large networks; before constructing
node_attr, compute a safe total_size = node_count * node_dim (use size_t and
check for multiplication overflow) and enforce a configurable or hard cap (e.g.,
compare against a MAX_NODES_ATTR_BYTES constant or runtime budget) and throw
std::length_error or std::bad_alloc if exceeded; place this guard immediately
before the vector<float> node_attr(...) creation and reference node_count,
node_dim, tt_dim, include_truth_table, ntk.size(), and ntk.num_pos() when
computing/validating total_size.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3a91a8f0-e472-4ddc-ba58-8440308d1200

📥 Commits

Reviewing files that changed from the base of the PR and between e97d2ee and 6035347.

📒 Files selected for processing (2)
  • docs/machine_learning.md
  • src/aigverse/networks/graph_tensors.hpp

Comment thread src/aigverse/networks/graph_tensors.hpp Outdated
@marcelwa

Copy link
Copy Markdown
Owner Author

@copilot resolve the merge conflicts in this pull request

Co-authored-by: marcelwa <19913620+marcelwa@users.noreply.github.com>

Copilot AI commented May 15, 2026

Copy link
Copy Markdown

@copilot resolve the merge conflicts in this pull request

Resolved in 5708ac6. I merged origin/main, regenerated uv.lock from the merged dependency graph, and re-ran the targeted graph-tensor/sequential-AIG tests successfully.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C++ Relating the C++ part of the project enhancement New feature or request machine learning Anything related to ML integration performance Performance related changes Python Relating the Python part of the project

Projects

None yet

Development

Successfully merging this pull request may close these issues.

✨ Zero-Copy Tensors via DLPack

3 participants