[Repo Assist] fix: remove numpy/scipy symbol re-exports from datasets module (issue #981)#1429

Closed
github-actions[bot] wants to merge 1 commit into main from repo-assist/fix-datasets-namespace-pollution-981-6046a9a036a4e113

@github-actions
Contributor

🤖 This is an automated PR from Repo Assist, an AI assistant.

Summary

Closes #981.

Issue #981 reports that dowhy.datasets.choice and dowhy.datasets.random appear in the generated documentation as if they were part of DoWhy's own API. They are not: they leak in through module-level imports such as from numpy.random import choice. The same problem affects bernoulli, halfnorm, poisson, and uniform, which are imported from scipy.stats.

Root Cause

datasets.py had two module-level from-imports that bound external names directly into its namespace:

from numpy.random import choice                             # leaked: choice
from scipy.stats import bernoulli, halfnorm, poisson, uniform  # leaked: 4 scipy objects

Because the module defines no __all__, every public top-level name, imported or not, is treated as part of its namespace. Sphinx picks these names up and lists them in the module docs with broken [source] links, since their real source lives in numpy/scipy, not in datasets.py.
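
The leak described above can be reproduced without DoWhy. The following is a minimal sketch that uses the standard-library random module as a stand-in for numpy.random; the module name demo_datasets and the function pick are hypothetical and exist only for illustration.

```python
import types


def public_names(source: str) -> set[str]:
    """Execute `source` as a throwaway module and return its public names."""
    module = types.ModuleType("demo_datasets")  # hypothetical module name
    exec(source, module.__dict__)
    return {name for name in vars(module) if not name.startswith("_")}


# Stand-in for the old pattern: the from-import binds `choice` at module level.
leaky_source = """
from random import choice

def pick(items):
    return choice(items)
"""

# Stand-in for the fixed pattern: only the module itself is bound.
qualified_source = """
import random

def pick(items):
    return random.choice(items)
"""

leaky = public_names(leaky_source)
qualified = public_names(qualified_source)

print(sorted(leaky))      # ['choice', 'pick']
print(sorted(qualified))  # ['pick', 'random']
```

In this sketch the function choice no longer appears as a module member once the call is qualified. The imported module object (random here) is still bound at module level.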

Fix

  • Replace all three choice(...) call sites with np.random.choice(...) and remove the import.
  • Replace all bernoulli/halfnorm/poisson/uniform call sites in sales_dataset() with ss.bernoulli/ss.halfnorm/ss.poisson/ss.uniform (scipy.stats is already aliased as ss), and remove the from scipy.stats import … line.

No functional change; behaviour is identical.
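
The claim of identical behaviour can be spot-checked for choice. This sketch assumes NumPy is installed; the sample values and seed are arbitrary and are not taken from datasets.py. The same identity check should apply to the scipy.stats objects, which are not exercised here.

```python
import numpy as np
from numpy.random import choice  # the import this PR removes

# The bare name and the qualified name refer to the same object.
same_object = choice is np.random.choice

# With the same seed, both spellings draw the same samples.
np.random.seed(0)
bare = choice([0, 1, 2, 3], size=5)
np.random.seed(0)
qualified_draw = np.random.choice([0, 1, 2, 3], size=5)

identical = bool(np.array_equal(bare, qualified_draw))
print(same_object, identical)
```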

Test Status

  • ✅ File parses cleanly (ast.parse)
  • ✅ No remaining bare references to choice, bernoulli, halfnorm, poisson, or uniform (verified via grep)
  • ℹ️ Full test suite could not be run in this environment (dependencies not installed), but the change is purely a rename of local symbols to their qualified forms — no logic is altered.
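
The grep check above could also be done on the syntax tree, which avoids false matches inside qualified names such as np.random.choice. This is a sketch only, not the check that was actually run; the helper bare_references and the two sample strings are hypothetical.

```python
import ast

# Names that should no longer be referenced without a qualifier.
REMOVED = {"choice", "bernoulli", "halfnorm", "poisson", "uniform"}


def bare_references(source: str, names: set[str]) -> list[tuple[str, int]]:
    """Return (name, line) pairs for unqualified uses of `names` in `source`."""
    tree = ast.parse(source)
    hits = []
    for node in ast.walk(tree):
        # ast.Name covers bare identifiers only; `np.random.choice` is an
        # ast.Attribute chain whose only Name node is `np`.
        if isinstance(node, ast.Name) and node.id in names:
            hits.append((node.id, node.lineno))
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


sample_before = "x = choice([1, 2])\ny = uniform(0, 1)\n"
sample_after = "x = np.random.choice([1, 2])\ny = ss.uniform(0, 1)\n"

print(bare_references(sample_before, REMOVED))  # [('choice', 1), ('uniform', 2)]
print(bare_references(sample_after, REMOVED))   # []
```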

Note

🔒 Integrity filtering filtered 109 items

Integrity filtering activated and filtered the following items during workflow execution.
This happens when a tool call accesses a resource that does not meet the required integrity or secrecy level of the workflow.

Generated by Repo Assist

To install this agentic workflow, run

gh aw add githubnext/agentics/workflows/repo-assist.md@b897c2f3e43bde9ff7923c8fa9211055b26e27cc

…981)

Remove bare 'from numpy.random import choice' and replace the three call
sites with 'np.random.choice(...)'. This stops the numpy 'choice' symbol
from appearing as part of dowhy.datasets's public API.

Remove bare 'from scipy.stats import bernoulli, halfnorm, poisson, uniform'
and replace all call sites in sales_dataset() with the already-imported
alias 'ss.*' (scipy.stats is already imported as 'ss'). This stops four
scipy distribution objects from polluting the module namespace and confusing
Sphinx source-link generation.

No functional change; behaviour is identical.

Closes #981

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@emrekiciman emrekiciman marked this pull request as ready for review March 30, 2026 17:28
Member

@emrekiciman emrekiciman left a comment

lgtm, and passes tests.

@emrekiciman
Member

looks like this problem doesn't exist in the current docs. closing without merging.

@emrekiciman emrekiciman closed this Apr 2, 2026

Development

Successfully merging this pull request may close these issues.

Some [source] links in the docs do not work
