Conversation
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
PR SummaryMedium Risk Overview Adds Written by Cursor Bugbot for commit c244263. This will update automatically on new commits. Configure here. |
| # Reduce to (chain, draw, n_o, n_o); dim names may vary | ||
| cov_arr = np.asarray(cov_post).squeeze() | ||
| if cov_arr.ndim != 4: | ||
| cov_arr = cov_arr.reshape((-1, -1, n_o, n_o)) # (chain, draw, n_o, n_o) |
There was a problem hiding this comment.
Two -1 dimensions in numpy reshape causes crash
High Severity
cov_arr.reshape((-1, -1, n_o, n_o)) uses two -1 dimensions, which numpy does not allow — only one unknown dimension can be specified. This will always raise a ValueError at runtime when cov_arr.ndim != 4, making print_coefficients crash whenever it tries to print the residual covariance for MultivarLinearReg.
| self.pre_y_raw = np.column_stack(pre_y_list) # (n_pre, n_outcomes) | ||
| self.post_y_raw = np.column_stack(post_y_list) # (n_post, n_outcomes) | ||
| self.pre_X = pre_X_list[0] | ||
| self.post_X = post_X_list[0] |
There was a problem hiding this comment.
Only first formula's X used for all outcomes
Medium Severity
When multiple formulas with different right-hand sides are provided (e.g., ["y1 ~ 1 + t + x1", "y2 ~ 1 + t + x2"]), only the first formula's design matrix X is used for fitting all outcomes via MultivarLinearReg. The per-outcome pre_X_list and post_X_list are stored but never used for model fitting, silently producing incorrect estimates for non-first outcomes with different predictors. The docstring's example encourages exactly this usage.
Additional Locations (1)
| Object with .table (DataFrame) and .text (str) attributes. | ||
| In the multivariate case (multiple outcomes), the table has an | ||
| ``outcome`` column and a ``statistic`` column (e.g. "average", | ||
| "cumulative"), with one row per (outcome, statistic). |
There was a problem hiding this comment.
Multivariate comparison summary crashes on float conversion
Medium Severity
Calling effect_summary(period="comparison") on a multivariate ITS crashes. The period="comparison" early return bypasses the new multivariate handling added at lines 1732–1773. The delegated _comparison_period_summary then calls float(intervention_avg.mean(dim=["chain", "draw"]).values) on data that still carries the outcomes dimension, producing a multi-element array where a scalar is expected. This is caused by the new algorithm() producing impact data with an outcomes dimension that the comparison path never squeezes.
Additional Locations (1)
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #724 +/- ##
==========================================
- Coverage 94.36% 93.82% -0.54%
==========================================
Files 44 45 +1
Lines 7591 8101 +510
Branches 461 515 +54
==========================================
+ Hits 7163 7601 +438
- Misses 264 321 +57
- Partials 164 179 +15 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|


Acknowledgments & Disclaimer
This PR involved significant architectural changes and required taking a fair amount of initiative. I had to make several design choices that may still need refinement. Please feel free to challenge my approach and suggest improvements; I am very open to feedback to ensure this aligns with the project's standards.
To showcase these new features, I have included a demo notebook in this PR which covers the full scope of the implementation.
1. New Model: MultivarLinearReg
I have introduced a new model capable of handling multiple outcomes simultaneously.
Cholesky Decomposition: The model uses Cholesky decomposition for the covariance matrix. This is a robust approach because it ensures the matrix remains symmetric and positive-definite, while significantly improving computational efficiency and numerical stability compared to a direct matrix inversion.
2. Structural Refactor: New "Outcome" Dimension
I updated the data structure by introducing a dedicated outcome dimension.
Scalability: This choice makes the library more scalable for high-dimensional targets.
Design Choice: I deliberately avoided using the treated units dimension for this purpose. While treated units are typically assumed to be independent, outcomes can be highly correlated. Using a distinct dimension allows us to model these correlations properly.
3. Visualization & Reporting
New Plotting Modes: Added two ways to visualize results: overlay (for direct comparison) and per_outcome (for individual analysis).
Selective Plotting: It is possible to filter which outcomes to display by passing a list to the outcomes_to_plot argument.
Enhanced Summary: The summary() output has been updated to provide a clear distinction between outcomes, making the statistical results easier to read.