Skip to content

fix: arg/kwarg collision for local numpy vars in caching#9751

Merged
dmadisetti merged 1 commit into
mainfrom
dm/hash-fix
Jun 3, 2026
Merged

fix: arg/kwarg collision for local numpy vars in caching#9751
dmadisetti merged 1 commit into
mainfrom
dm/hash-fix

Conversation

@dmadisetti
Copy link
Copy Markdown
Collaborator

📝 Summary

We do not properly detect local (defined in same cell) kwarg/arg defs, causing invalid collisions.

Came across this when doing some benchmarking.

Copilot AI review requested due to automatic review settings June 2, 2026 02:13
@vercel
Copy link
Copy Markdown

vercel Bot commented Jun 2, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
marimo-docs Ready Ready Preview, Comment Jun 2, 2026 2:14am

Request Review

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 2 files

Architecture diagram
sequenceDiagram
    participant UserCode as User Cell
    participant Cache as @mo.cache Decorator
    participant Hash as _is_memoizable()
    participant Graph as Cell Graph
    participant ContentHash as Content Hash

    Note over UserCode,ContentHash: Cache memoization check flow

    UserCode->>Cache: Call f(a=ndarray) with args/kwargs
    Cache->>Hash: Check memoizability(local_ref, value)
    Hash->>Graph: Look up definitions.get(local_ref)
    Graph-->>Hash: Return set of defining cell IDs (or empty set)
    
    alt local_ref defined in same cell
        Note over Hash: defs = {self.cell_id}
        Hash->>Hash: bool(defs) is True, self.cell_id in defs
        Hash-->>Cache: Return False (not memoizable)
        Cache->>ContentHash: Recompute content hash
        ContentHash-->>Cache: New hash from runtime value
        Cache-->>UserCode: cache miss, execute function
        
    else local_ref defined in different cell
        Note over Hash: defs = {other_cell_id}
        Hash->>Hash: bool(defs) is True, self.cell_id not in defs
        Hash-->>Cache: Return True (memoizable)
        Cache->>ContentHash: Use cached content hash
        ContentHash-->>Cache: Existing hash from storage
        Cache-->>UserCode: cache hit, return cached result
        
    else local_ref not defined in any cell (local variable)
        Note over Hash: defs = empty set
        Hash->>Hash: bool(defs) is False
        Hash-->>Cache: Return False (not memoizable)
        Cache->>ContentHash: Recompute content hash
        ContentHash-->>Cache: New hash from runtime value
        Cache-->>UserCode: cache miss, execute function
    end

    Note over UserCode,ContentHash: Key fix: distinguish local vars (empty defs) from cross-cell refs
Loading

Re-trigger cubic

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes incorrect content-hash memoization during @mo.cache calls where locally-scoped values (not defined by any cell) could be memoized by name, causing different ndarray argument values (positional or keyword) to reuse a prior digest and incorrectly collide.

Changes:

  • Restrict content-hash memoization to references that are actually defined by (other) cells in the notebook graph.
  • Add a regression test ensuring distinct NumPy ndarray arguments to a cached function never collide across calls (positional/keyword forms).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
marimo/_save/hash.py Tightens _is_memoizable so only graph-defined refs are eligible for hash memoization, preventing memo collisions for function args/kwargs.
tests/_save/test_hash.py Adds a NumPy regression test validating cached function calls with distinct ndarray inputs don’t produce erroneous cache hits.

@dmadisetti dmadisetti added the bug Something isn't working label Jun 2, 2026
Copy link
Copy Markdown
Member

@kirangadhave kirangadhave left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@dmadisetti dmadisetti merged commit c0f07c0 into main Jun 3, 2026
45 of 46 checks passed
@dmadisetti dmadisetti deleted the dm/hash-fix branch June 3, 2026 00:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants