
maketables plug in #600

Closed
s3alfisc wants to merge 1 commit into pymc-labs:main from s3alfisc:main

Conversation

s3alfisc commented Dec 20, 2025

Hi @drbenvincent ,

Attached is a very drafty PR (written with lots of support from Claude) that enables maketables support for CausalPy via the new plug-in solution @dsliwka implemented. An example notebook is also attached.

For the RDD class, you get, for example, output like this:
[image: example maketables output for the RDD class]

Would you generally be interested in merging this PR?

Where I'd need your help: what coefficients should be reported by default? Which statistics might users be interested in beyond the defaults? Is my choice of 94% probability intervals a good one? Etc.

Note that this PR is more of a proof of concept and might need some cleaning up and unit tests. If this makes it into CausalPy, we should likely also implement tests over there to ensure we never break CausalPy flows.

Best, Alex


📚 Documentation preview 📚: https://causalpy--600.org.readthedocs.build/en/600/

@review-notebook-app

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB


cursor Bot commented Dec 20, 2025

PR Summary

Introduce maketables plug-in dunder methods on BaseExperiment (coef table, stats, depvar, vcov) with Bayesian/OLS handling, plus a demo notebook.

  • Core (experiments):
    • Add zero-coupling maketables integration on BaseExperiment via dunder methods:
      • __maketables_coef_table__: builds coefficient tables for PyMC (posterior mean, std, two-tailed posterior p, 94% HDI) and OLS (point estimates).
      • __maketables_stat__(key): returns N, model_type, experiment_type, and r2 (if available).
      • __maketables_depvar__: exposes dependent variable name.
      • __maketables_vcov_info__: returns vcov metadata.
  • Examples:
    • Add notebooks/maketables_demo.ipynb demonstrating DiD and RD models (OLS vs PyMC) and maketables.ETable usage.

Written by Cursor Bugbot for commit c5bb622.

@s3alfisc s3alfisc marked this pull request as draft December 20, 2025 15:06
widths = sorted_samples[interval_size:] - sorted_samples[:n_intervals]
min_idx = np.argmin(widths)
ci_lower = float(sorted_samples[min_idx])
ci_upper = float(sorted_samples[min_idx + interval_size])
HDI calculation has off-by-one error in interval bounds

The HDI (Highest Density Interval) calculation in _maketables_coef_table_bayesian has an off-by-one error. For a 94% interval containing ceil(0.94 * n) samples starting at index i, the interval spans indices i to i + interval_size - 1. However, the code computes widths = sorted_samples[interval_size:] - sorted_samples[:n_intervals], which calculates sorted_samples[i + interval_size] - sorted_samples[i] instead of sorted_samples[i + interval_size - 1] - sorted_samples[i]. Similarly, ci_upper uses sorted_samples[min_idx + interval_size] when it needs sorted_samples[min_idx + interval_size - 1]. This results in intervals containing one extra sample and potentially selecting a suboptimal HDI.
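A corrected sketch of the interval search (hypothetical helper name; a minimal version, not the draft's actual function):

```python
import numpy as np

def hdi_bounds(samples, prob=0.94):
    """Narrowest interval containing ceil(prob * n) consecutive sorted samples.

    The interval starting at index i ends at index i + interval_size - 1, so
    both the width formula and the upper bound use interval_size - 1.
    """
    sorted_samples = np.sort(np.asarray(samples, dtype=float))
    n = sorted_samples.size
    interval_size = int(np.ceil(prob * n))
    n_intervals = n - interval_size + 1  # valid start indices: 0 .. n_intervals - 1
    widths = sorted_samples[interval_size - 1:] - sorted_samples[:n_intervals]
    min_idx = int(np.argmin(widths))
    return (
        float(sorted_samples[min_idx]),
        float(sorted_samples[min_idx + interval_size - 1]),
    )
```

On 100 evenly spaced samples this returns an interval spanning exactly 94 samples, whereas the draft's formula spans 95.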


}
# Add R² if available
if hasattr(self, "score") and self.score is not None:
stat_map["r2"] = self.score

R² statistic returns Series instead of scalar value

The __maketables_stat__ method assigns self.score directly to stat_map["r2"] without handling the case where self.score is a pandas Series (which occurs for PyMC models). As shown in the notebook output, this causes the R² row in maketables to display the entire Series representation (unit_0_r2 0.836121 unit_0_r2_std 0.012656 dtype: float64) instead of a clean scalar value. Other code in the codebase (e.g., _bayesian_plot) properly extracts self.score["unit_0_r2"] when self.score is a Series.
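A hedged sketch of the scalar extraction (the "unit_0_r2" key mirrors the pattern the review cites from _bayesian_plot; treat it as an assumption about the Series layout):

```python
import pandas as pd

def scalar_r2(score):
    """Return R² as a float whether `score` is already a scalar (OLS path)
    or a pandas Series holding per-unit values (PyMC path)."""
    if isinstance(score, pd.Series):
        return float(score["unit_0_r2"])  # assumed key, as used in _bayesian_plot
    return float(score)
```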


interval_size = int(np.ceil(0.94 * n))
n_intervals = n - interval_size
widths = sorted_samples[interval_size:] - sorted_samples[:n_intervals]
min_idx = np.argmin(widths)

HDI calculation crashes with small MCMC sample sizes

The HDI calculation in _maketables_coef_table_bayesian crashes with ValueError: attempt to get argmin of an empty sequence when the number of MCMC samples is 16 or fewer. When n <= 16, interval_size = ceil(0.94 * n) equals n, making n_intervals = 0, which results in an empty widths array. Calling np.argmin on an empty array raises a ValueError. While unlikely in normal usage where MCMC runs with hundreds of samples, this could occur in quick testing scenarios with minimal sampling parameters.
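A minimal guard sketch (hypothetical helper name; it keeps the draft's width formula untouched so that only the empty-array case changes):

```python
import numpy as np

def guarded_hdi(sorted_samples, prob=0.94):
    """Fall back to the full sample range when ceil(prob * n) >= n, i.e.
    when there is only one candidate interval and `widths` would be empty."""
    sorted_samples = np.asarray(sorted_samples, dtype=float)
    n = sorted_samples.size
    interval_size = int(np.ceil(prob * n))
    if interval_size >= n:
        # Too few draws to compare candidate intervals: return the full range.
        return float(sorted_samples[0]), float(sorted_samples[-1])
    widths = sorted_samples[interval_size:] - sorted_samples[: n - interval_size]
    min_idx = int(np.argmin(widths))
    return float(sorted_samples[min_idx]), float(sorted_samples[min_idx + interval_size])
```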



codecov Bot commented Dec 24, 2025

Codecov Report

❌ Patch coverage is 18.00000% with 41 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.60%. Comparing base (2d6bba7) to head (c5bb622).
⚠️ Report is 164 commits behind head on main.

Files with missing lines Patch % Lines
causalpy/experiments/base.py 18.00% 41 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #600      +/-   ##
==========================================
- Coverage   93.27%   92.60%   -0.67%     
==========================================
  Files          37       37              
  Lines        5632     5682      +50     
  Branches      367      373       +6     
==========================================
+ Hits         5253     5262       +9     
- Misses        248      289      +41     
  Partials      131      131              

☔ View full report in Codecov by Sentry.

@drbenvincent drbenvincent added the enhancement New feature or request label Dec 25, 2025
drbenvincent (Collaborator) commented Feb 18, 2026

Happy to take this on, and I'll ping you if I run into any issues, @s3alfisc.


PR #600: Forward Direction for maketables Integration

First, thanks for pushing this. The direction is exactly right, and the proof of concept clearly shows that CausalPy + maketables can produce high-quality regression tables.

I’m very supportive of landing this capability, even if we do it via a fresh PR rather than merging this draft as-is.

Where PR #600 currently stands

  • PR status as of February 18, 2026: open, draft, and conflicting with current main.
  • It was opened on December 20, 2025, against an older base commit; main has moved substantially since then.
  • Core CI tests passed in that branch, but gating checks did not fully pass (pre-commit and patch coverage).
  • There are no targeted tests yet for the new __maketables_* behavior.

What this PR already proves

  • The zero-coupling plugin approach is viable for CausalPy.
  • A plugin on experiment results is a good architectural fit for CausalPy users.
  • We can provide a single path that works for both sklearn-backed and PyMC-backed experiments.

How this works without adding maketables as a dependency

The intended integration is an optional plugin pattern:

  • CausalPy does not import maketables.
  • CausalPy exposes plugin methods on experiment result objects (for example __maketables_coef_table__).
  • maketables detects these methods at runtime when a user passes a CausalPy object to ETable(...).
  • If maketables is not installed, nothing changes for normal CausalPy usage.

In practice, the user opts in by installing/importing maketables in their own environment:

import causalpy as cp
from maketables import ETable

result = cp.DifferenceInDifferences(...)
table = ETable(result)  # maketables reads CausalPy's __maketables_* interface

So yes: these CausalPy methods are primarily for the optional maketables pathway, while remaining harmless and inert for users who never install maketables.

Why we should not merge this draft directly

  • It is now behind current architecture and has merge conflicts.
  • It introduces behavior without a test contract.
  • Some statistical choices need explicit maintainer policy before we lock in defaults.
  • A few implementation details in the draft should be strengthened for long-term correctness and maintainability.

Concrete changes needed in CausalPy

This is the practical implementation scope required to make maketables support production-ready.

Scalable backend design (important)

The plugin API should live on BaseExperiment, but backend-specific statistics should not.

Current PR #600 status on this point:

  • It demonstrates backend branching, but mostly as inline logic in BaseExperiment.
  • It is not yet a full adapter-based separation.
  • So the adapter pattern described here is a recommended refinement for mergeable, scalable design.

Recommended structure:

  • BaseExperiment owns only thin plugin entry points (__maketables_*).
  • Backend-specific extraction is delegated to internal adapters selected by model type.
  • This avoids growing BaseExperiment into a large if isinstance(...) block over time.

Proposed internal pattern:

  • Add an internal adapter protocol (for example in causalpy/reporting/maketables_adapters.py) with methods like:
    • coef_table(experiment) -> pd.DataFrame
    • stat(experiment, key) -> Any
    • vcov_info(experiment) -> dict[str, Any]
    • default_stat_keys(experiment) -> list[str] | None
    • stat_labels(experiment) -> dict[str, str] | None
  • Add a small factory that chooses the adapter by model backend:
    • PyMCModel -> PyMCMaketablesAdapter
    • RegressorMixin -> SklearnMaketablesAdapter
    • future statsmodels backend -> StatsmodelsMaketablesAdapter

This gives us one stable plugin interface externally, with pluggable backend logic internally.

Illustrative implementation sketch:

# causalpy/experiments/base.py

import pandas as pd

from causalpy.reporting.maketables_adapters import get_maketables_adapter


@property
def __maketables_coef_table__(self) -> pd.DataFrame:
    return get_maketables_adapter(self.model).coef_table(self)

def __maketables_stat__(self, key: str):
    return get_maketables_adapter(self.model).stat(self, key)

@property
def __maketables_depvar__(self) -> str:
    return getattr(self, "outcome_variable_name", "y")


# causalpy/reporting/maketables_adapters.py

from typing import Any, Protocol

import pandas as pd
from sklearn.base import RegressorMixin

from causalpy.pymc_models import PyMCModel


class MaketablesAdapter(Protocol):
    def coef_table(self, experiment) -> pd.DataFrame: ...
    def stat(self, experiment, key: str) -> Any: ...
    def vcov_info(self, experiment) -> dict[str, Any]: ...
    def stat_labels(self, experiment) -> dict[str, str] | None: ...
    def default_stat_keys(self, experiment) -> list[str] | None: ...


class PyMCMaketablesAdapter:
    ...


class SklearnMaketablesAdapter:
    ...


def get_maketables_adapter(model) -> MaketablesAdapter:
    if isinstance(model, PyMCModel):
        return PyMCMaketablesAdapter()
    if isinstance(model, RegressorMixin):
        return SklearnMaketablesAdapter()
    # Later extension path for statsmodels (example):
    # if isinstance(model, StatsmodelsResultType):
    #     return StatsmodelsMaketablesAdapter()
    raise TypeError(f"Unsupported model backend: {type(model)!r}")

Why this matters for contributors:

  • The public plugin contract stays stable (__maketables_*).
  • Backend differences are explicit and testable in one place per backend.
  • Adding a new backend is additive (new adapter + tests), not a rewrite of BaseExperiment.

1) Add plugin methods on BaseExperiment with robust defaults

Add plugin-facing methods/properties on causalpy/experiments/base.py:

  • __maketables_coef_table__ (required)
  • __maketables_stat__(key) (recommended)
  • __maketables_depvar__ (recommended)
  • __maketables_vcov_info__ (optional but useful)
  • __maketables_stat_labels__ (optional)
  • __maketables_default_stat_keys__ (optional)

Design requirements:

  • no import of maketables from CausalPy
  • plugin methods on BaseExperiment should delegate to an adapter, not embed backend-specific statistical logic inline

2) Define coefficient-table behavior for both model families

For __maketables_coef_table__, return a DataFrame indexed by coefficient name (Coefficient) with canonical columns used by maketables:

  • Required columns: b, se, p
  • Optional columns we should provide: t, ci95l, ci95u

Implementation details:

PyMC models (PyMCModel):

  • b: posterior mean of beta
  • se: posterior standard deviation
  • p: explicitly defined Bayesian tail probability (see decision points below)
  • ci95l/ci95u: posterior interval bounds with explicit level definition
  • t: NaN (not a frequentist t-stat)

sklearn/OLS models (RegressorMixin via adapter):

  • b: coefficients from get_coeffs()
  • se, t, p, ci95l, ci95u: NaN unless we have a principled way to compute them from the model object

Important: for multi-treated-unit Bayesian models, we should pick a deliberate policy instead of silently taking the first unit. Options: require explicit unit selection, or document/encode a deterministic aggregation or default.

To keep this scalable, the multi-unit policy should be implemented inside the Bayesian adapter, not in BaseExperiment.
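As an illustration of the canonical shape (column names b/se/p/t/ci95l/ci95u as above; the two-tailed posterior p and the equal-tailed interval are assumptions pending the decision points below, and the helper name is hypothetical):

```python
import numpy as np
import pandas as pd

def bayesian_coef_table(draws: dict, ci: float = 0.95) -> pd.DataFrame:
    """Canonical coefficient table from posterior draws (one 1-D array per name)."""
    rows = {}
    for name, samples in draws.items():
        p_pos = float((samples > 0).mean())
        lo, hi = np.quantile(samples, [(1 - ci) / 2, 1 - (1 - ci) / 2])
        rows[name] = {
            "b": float(samples.mean()),          # posterior mean
            "se": float(samples.std(ddof=1)),    # posterior standard deviation
            "t": np.nan,                         # no frequentist t-stat for draws
            "p": 2 * min(p_pos, 1 - p_pos),      # two-tailed posterior tail prob (assumed definition)
            "ci95l": float(lo),
            "ci95u": float(hi),
        }
    table = pd.DataFrame.from_dict(rows, orient="index")
    table.index.name = "Coefficient"
    return table
```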

3) Define model-level stats exposed to maketables

Implement __maketables_stat__(key) on experiments with a stable key map:

  • N (observations)
  • r2 when available (format per model type)
  • se_type (e.g., "Bayesian" or OLS/estimator-specific type)
  • Additional keys only when values are meaningful and stable

Also provide:

  • __maketables_depvar__ from experiment outcome variable metadata
  • __maketables_default_stat_keys__ to control useful defaults (for example ["N", "r2"])
  • __maketables_stat_labels__ if we need custom labels for Bayesian-specific rows

4) Add a targeted test suite in causalpy/tests/

Add dedicated tests for plugin behavior (new file(s), for example causalpy/tests/test_maketables_plugin.py):

  • Presence and shape of __maketables_coef_table__
  • Canonical columns exist and are numeric/nullable as expected
  • Coefficient index name is set
  • __maketables_stat__ returns expected values and None for unsupported keys
  • __maketables_depvar__ and __maketables_vcov_info__ semantics
  • Behavior for both PyMC-backed and sklearn-backed experiments
  • Multi-unit behavior is explicitly tested

Coverage expectation: the plugin code path should have direct tests, not just notebook demonstration.

5) Add documentation example in docs structure

Move from an ad hoc notebook to docs-conforming material in docs/source/notebooks/ with naming consistent with project conventions.

Doc goal:

  • Show one OLS and one PyMC example
  • Show a side-by-side ETable(...) example
  • Clearly explain Bayesian interpretation of displayed uncertainty and star behavior

Maintainer decisions needed before implementation is final

These choices should be explicit in code and docs:

  • What should Bayesian p mean in plugin output?
  • What interval level should be default in plugin columns (95% vs 94%)?
  • Should significance stars be shown for Bayesian models by default, and if yes, based on what exact definition?
  • How should multi-treated-unit experiments map to one coefficient table column per model object?

Without these decisions, integration works technically but can be statistically ambiguous for users.

Why this scales to future backends (for example statsmodels)

With adapter delegation:

  • Adding a new backend does not require rewriting plugin methods.
  • We implement one new adapter class and register it in the backend factory.
  • Existing PyMC/sklearn behavior remains unchanged and testable.

This makes backend growth additive instead of invasive.

Recommended execution plan

Phase 1: Mergeable MVP (small, test-first)

  • Implement plugin methods in BaseExperiment
  • Ship robust defaults for PyMC + sklearn paths
  • Add unit/integration tests for both model families
  • Keep scope to core plugin API and correctness

Phase 2: Documentation and examples

  • Add docs notebook under docs/source/notebooks/
  • Add short user-facing guidance on Bayesian table interpretation

Phase 3: Enhancements (optional)

  • Richer OLS inferential stats when model object supports them
  • Better multi-unit controls (user-selectable treated unit or expanded output design)
  • Optional additional stats labels and defaults

Suggested PR strategy

Use PR #600 as the conceptual prototype and open a fresh implementation PR against current main with:

  • A narrow, reviewed MVP scope
  • Explicit statistical defaults
  • Full tests for new behavior

That route should get us to the same goal faster and with much lower maintenance risk.

s3alfisc (Author) commented:

Sounds great! Happy to help!

drbenvincent added a commit that referenced this pull request Mar 1, 2026
…600)

Introduce adapter-based __maketables_* support on BaseExperiment with targeted tests and configurable PyMC HDI intervals, then wire maketables tables into DiD/ITS/SC/RD docs notebooks with backend-appropriate formatting and guidance.

Made-with: Cursor
drbenvincent added a commit that referenced this pull request Mar 11, 2026
…600)

Introduce adapter-based __maketables_* support on BaseExperiment with targeted tests and configurable PyMC HDI intervals, then wire maketables tables into DiD/ITS/SC/RD docs notebooks with backend-appropriate formatting and guidance.

Made-with: Cursor
drbenvincent added a commit that referenced this pull request Apr 2, 2026
* Add maketables plugin hooks and docs integration for core notebooks (#600)

Introduce adapter-based __maketables_* support on BaseExperiment with targeted tests and configurable PyMC HDI intervals, then wire maketables tables into DiD/ITS/SC/RD docs notebooks with backend-appropriate formatting and guidance.

Made-with: Cursor

* Expand maketables coverage and notebook examples

Add maketables adapter/test coverage for additional experiments and backend-specific posterior coefficient shapes. Update affected docs notebooks with simple ETable examples, including side-by-side model comparisons where multiple fitted results are present.

Made-with: Cursor

* Expand maketables adapter test coverage for codecov/patch

Add targeted unit tests for internal adapter helpers and edge cases
that were uncovered in the codecov/patch report (62% → expected ~95%).
Covers _safe_observation_count, _safe_r2_value, _canonical_frame,
_extract_hdi_bounds, _get_maketables_hdi_prob, get_maketables_adapter,
and SklearnMaketablesAdapter error paths. Adds integration test for
PyMCMaketablesAdapter stat/metadata methods.

Made-with: Cursor

* Fix notebook validation and add missing docstrings after rebase

Resolve merge conflicts from PR #749 in staggered_did_pymc.ipynb
(re-add ETable cells and fix nbformat output metadata/name fields).
Drop rkink_pymc.ipynb output-only changes in favor of main's version.
Add docstrings to adapter methods to satisfy interrogate threshold.

Made-with: Cursor

* Fix lift test notebook compatibility with newer pymc-marketing

Use the named-dimension geometric adstock call only when the installed pymc-marketing version requires it, and repair stale notebook output metadata so schema validation passes again.

Made-with: Cursor

* Fix lift test saturation path for newer pymc-marketing

Keep the named-dimension notebook path xtensor-native through logistic_saturation so the lift test example runs on newer pymc-marketing releases while preserving the legacy code path.

Made-with: Cursor

Labels

enhancement New feature or request


2 participants