Methodology review fixes: England-only CT rebate, log-linear behavioural form, decile-specific elasticities #1

Merged
MaxGhenis merged 13 commits into main from review-fixes
Apr 21, 2026

@MaxGhenis MaxGhenis commented Apr 20, 2026

Why

Methodology pass over the energy_shock package and dashboard.

Note: an earlier version of this branch attempted to migrate the
simulation layer to the unified policyengine Python API; that
change was reverted in 901b654 because policyengine 4.1.x's
bundled UK release manifest was pinned to a data-package version
without a published release_manifest.json on Hugging Face, blocking
bootstrap. The final diff retains policyengine-uk directly. The
migration can be revisited once upstream stabilises.

What shipped

Methodology

  • Decile-specific price elasticities. Each household's behavioural response is computed at its own income decile's short-run elasticity per Priesmann & Praktiknjo (2025), interpolated linearly from −0.64 (D1) to −0.11 (D10). A population-mean elasticity averages away the progressivity that matters.
  • Log-linear behavioural form. Spend response uses (p_new / p_old) ** (1 + ε) rather than the linear first-order approximation (1 + p)(1 + εp), which produces negative consumption — physically impossible — for combinations like ε = −0.64 and the +161% Q1 2023 peak. The log-linear form stays admissible at all ε ∈ (−1, 0] and p ≥ 0. Transferability and extrapolation caveats documented in README.md.
  • England-only CT rebate. ebr_council_tax_rebate in policyengine-uk pays any A–D band household including Scottish/Welsh/NI ones; the 2022 policy was England-only. Gated at the analysis layer in policy_ct_rebate, policy_post_shock, and _grouped_post_policy so aggregates match the real policy's geographic scope.
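
A minimal sketch of the decile-specific lookup described above, assuming only the two endpoints quoted in this PR (−0.64 at D1, −0.11 at D10) and a linear interpolation between them. The function name `household_elasticities`, the handling of missing deciles, and the fallback value are hypothetical and are not taken from the package.

```python
import numpy as np

# Linear interpolation between the endpoints quoted in this PR:
# D1 = -0.64, D10 = -0.11 (Priesmann & Praktiknjo 2025).
ELASTICITY_BY_DECILE = np.linspace(-0.64, -0.11, 10)


def household_elasticities(deciles, fallback):
    """Return one elasticity per household from its income decile (1..10).

    Households with a missing or out-of-range decile get `fallback`
    (a hypothetical choice for this sketch).
    """
    d = np.asarray(deciles, dtype=float)
    valid = np.isfinite(d) & (d >= 1) & (d <= 10)
    # Replace invalid entries with a dummy decile before indexing.
    idx = np.where(valid, d, 1).astype(int) - 1
    return np.where(valid, ELASTICITY_BY_DECILE[idx], fallback)


# Hypothetical households in deciles 1, 5, 10 and one with no decile.
example = household_elasticities([1, 5, 10, np.nan], fallback=-0.15)
```

Keeping the lookup as a pure array function is what lets it be unit-tested without a microsimulation or dataset download.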

Code

  • Dead policy functions removed (in a follow-up commit after review): policy_epg, policy_wfa, policy_combined, policy_net_position were fully implemented but never called from generate.py. Dropped along with the now-unused EPG_TARGET / WFA_HIGHER / WFA_LOWER config constants. The dashboard consumes only policies.flat_transfer, policies.ct_rebate, and policy_post_shock.
  • Vectorised _build_household_type — old per-household Python loop replaced with pandas groupby + boolean masks.
  • Deps pinned in pyproject.toml with lower bounds: policyengine-uk>=2.88.0, microdf-python>=1.2.0, pandas>=2.0, numpy>=1.26.
  • CI workflow added (.github/workflows/pull_request.yaml) — runs pytest tests/. The unit suite is pure-numerical (no microsim bootstrap, no HF token required).
  • Unit tests for the decile-specific elasticity helpers in tests/test_elasticity.py: Priesmann endpoints, per-household decile lookup, missing-decile fallback, zero-price-change identity, low-decile cuts harder at +60 %, physical admissibility at +161 %.
  • Lean sdist: [tool.hatch.build.targets.sdist] ships only the Python package, tests, README, LICENSE, and pyproject.toml; the ~10 MB dashboard/ subtree is excluded.

Dashboard

  • Dashboard.jsx prose / UI tweaks and regenerated results*.json / results_breakdowns*.json for UK + England / Scotland / Wales / Northern Ireland. Old results_v2*.json artefacts dropped.

Not changed

  • np.maximum(shocked - payment, baseline_energy) floor semantics kept; it matches the 'extra cost from shock' question.
  • EPG linear scaling across scenarios not reintroduced; can be added in a later PR if we want per-scenario PE reform runs.

Test plan

  • pytest tests/ passes on the decile-elasticity suite
  • python -m energy_shock --all-countries runs end-to-end (requires HUGGING_FACE_TOKEN)
  • Spot-check UK aggregate: baseline.mean_energy_spend close to prior value
  • Spot-check CT-rebate panels: Scotland / Wales / NI average benefit is £0; England > £0
  • Dashboard loads all five country JSONs

Generated with Claude Code

## Migration

Replaces raw `from policyengine_uk import Microsimulation` calls with
the unified `policyengine.py` API:

- `baseline._build_simulation` constructs a `policyengine.core.Simulation`
  with `tax_benefit_model_version=uk_latest` and passes `extra_variables`
  so the energy-specific columns (electricity_consumption, gas_consumption,
  region, accommodation_type) plus all policy reform outputs (epg_subsidy,
  ebr_*, winter_fuel_allowance) are materialised on `output_dataset.data`.
- `ensure_datasets` handles HF download + uprating to the target year,
  removing the manual URL passing.
- `_hh_array` replaces `sim.calculate(var, YEAR).values` across
  `sections.py` — accesses the output frame directly.
- Reform dicts in sections.py now build via `build_reform_simulation`;
  parameter paths are unchanged.

## Review fixes

Addresses findings from the methodology review:

1. `rising_block_tariff` post-shock was returning `{"deciles": []}`
   stubs. Finished the implementation — surcharge rate is now
   recomputed under each scenario so cost-neutrality is enforced at
   shock prices, and per-decile extra-cost rows are populated.

2. Added `ELASTICITY_BY_DECILE` (Priesmann & Praktiknjo 2025,
   −0.64 D1 → −0.11 D10 linear interp) to `config.py`. Documented
   the uniform-vs-differentiated distinction and the constant-
   elasticity-to-+161% caveat in the config docstring.

3. Pinned `policyengine`, `policyengine-uk`, `microdf-python`,
   `pandas`, `numpy` in `pyproject.toml`. Updated README install
   steps to `pip install -e .` now that deps are declared.

4. Vectorised `_build_household_type` — the previous Python loop
   over benunits / persons became pandas groupby + boolean masks.
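
The loop-to-groupby change in item 4 can be illustrated with a small sketch. The column names, the age threshold, and the household-type labels below are hypothetical; the real `_build_household_type` works from benunit and person data whose details are not shown in this PR.

```python
import numpy as np
import pandas as pd

# Hypothetical person-level frame: one row per person.
persons = pd.DataFrame(
    {
        "household_id": [1, 1, 2, 3, 3, 3],
        "age": [34, 36, 71, 40, 8, 5],
    }
)

# Boolean mask per person, then one groupby per count (no Python loop).
is_child = persons["age"] < 18
counts = pd.DataFrame(
    {
        "n_children": is_child.groupby(persons["household_id"]).sum(),
        "n_adults": (~is_child).groupby(persons["household_id"]).sum(),
    }
)

# Vectorised classification with boolean masks; first match wins.
conditions = [
    counts["n_children"] > 0,
    counts["n_adults"] == 1,
    counts["n_adults"] >= 2,
]
labels = ["with children", "single adult", "multiple adults"]
counts["household_type"] = np.select(conditions, labels, default="other")
```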

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vercel

vercel Bot commented Apr 20, 2026

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| energy-price-shock | Ready | Preview, Comment | Apr 21, 2026 7:16pm |


Review findings follow-up:
- Pin `policyengine-uk` to the exact version the `policyengine` 4.1
  manifest requires (2.88.0). Floor was triggering a compatibility
  warning on every import.
- Bump `policyengine` floor to 4.1.0 now that 4.0 shipped and 4.1 is
  on PyPI (includes the extra_variables fix from PE.py#307).
- Wire `ELASTICITY_BY_DECILE` into `behavioral_responses` — each
  scenario now emits a `deciles_priesmann` list alongside the headline
  uniform-elasticity `deciles`, so readers can see the progressivity
  the uniform value suppresses.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@MaxGhenis


Subagent review flagged blockers; now resolved:

  1. extra_variables plumbing: policyengine.py#307 ("Honor Simulation.extra_variables on US/UK microsim paths", closes #303) landed (merged by Max) and is now released as policyengine==4.1.0 on PyPI. Simulation.extra_variables is honoured by the UK run() loop.
  2. policyengine>=4.0.0 PyPI availability — 4.1.0 is live.
  3. policyengine-uk pin — bumped to exact ==2.88.0 (the manifest-matching version) to silence the compatibility warning.
  4. Priesmann sensitivity — no longer config-only dead code. behavioral_responses now emits a deciles_priesmann array alongside the headline deciles. Uses income-differentiated −0.64 (D1) → −0.11 (D10) elasticities per Priesmann & Praktiknjo (2025).

Left for a later pass (not blocking):

  • Wiring deciles_priesmann into the dashboard as a sensitivity toggle.
  • Caveat UI text for the +161% extreme scenario (README has the caveat; the dashboard does not).

Couldn't verify end-to-end locally because importing policyengine now requires HF auth to fetch the data release manifest (DataReleaseManifestUnavailableError without HUGGING_FACE_TOKEN). CI should have the token set; if it doesn't, we'll know from the build logs.

## Analysis

Drop the uniform −0.15 elasticity in favour of Priesmann & Praktiknjo
(2025) income-differentiated values (−0.64 D1 → −0.11 D10). Every
behavioural-response calculation now operates on per-household
elasticities, so low-income households correctly cut consumption more
sharply and the progressivity of a shock flows through to output.

Affected:
- `behavioural_responses` (renamed)
- `policy_net_position`, `policy_post_shock`, NEG, RBT scenarios
- `_grouped_post_policy` and `_grouped_shock` helpers — now take a
  per-household behavioural factor array and aggregate within groups,
  reflecting each group's actual decile composition
- Headline scenario output now reports `elasticity_by_decile` and
  `mean_elasticity` (weighted mean from the dataset) instead of a
  single hardcoded number

## Spelling

British English throughout internal identifiers, JSON output keys,
and dashboard: `behavioral` → `behavioural`. Existing results JSON
files migrated in-place so the dashboard keeps rendering until the
next full regeneration.

## Packaging

Upgrade to a proper Python package:

- `pyproject.toml` uses hatchling, declares authors, license (AGPL-3),
  classifiers, URLs, `[project.scripts]` entry point, `[dev]` extra
  (pytest, ruff), ruff + pytest config.
- `energy_shock/__init__.py` exposes `__version__` via importlib
  metadata and the public run_all / run_all_countries entry points.
- Lazy-import `policyengine` inside `baseline._load_dataset` /
  `_build_simulation` so importing `energy_shock` for pure helpers
  doesn't require an HF token.

## Tests + CI

- `tests/test_elasticity.py` covers decile mapping, missing-decile
  fallback, zero-price identity, and low-vs-high-decile monotonicity.
  Pure-function unit tests — no microsim, no dataset download.
- `.github/workflows/pull_request.yaml` runs ruff format + lint,
  pytest, and dashboard Vite build on every PR + push to main.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…d, prose sync, LICENSE

Physics fix:
Switch _behavioural_factor_hh from the linear first-order approximation
(1+p)(1+eps*p) to the canonical constant-elasticity form (1+p)^(1+eps).
The linear form produces a negative spending factor for combinations
like eps = -0.64 and p = +1.61 — implying consumption cuts > 100 %,
which is physically impossible and was being rendered as-is in the
Q1 2023 peak KPIs.

Log-linear stays admissible for all eps in (-1, 0] and p >= 0, and
collapses to the linear approximation for small p. New regression
test test_behavioural_factor_physically_admissible_at_extreme_shock
asserts positivity across the full Priesmann range at +161 %.
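
The two functional forms can be compared directly. This is a sketch using the formulas quoted in this commit message; the function names are hypothetical and do not reproduce `_behavioural_factor_hh`.

```python
import numpy as np


def linear_factor(p, eps):
    """Linear first-order form: (1 + p) * (1 + eps * p)."""
    return (1.0 + p) * (1.0 + eps * p)


def constant_elasticity_factor(p, eps):
    """Constant-elasticity (log-linear) form: (1 + p) ** (1 + eps)."""
    return (1.0 + p) ** (1.0 + eps)


# Full elasticity range quoted in this PR, evaluated at the +161 % peak.
eps_range = np.linspace(-0.64, -0.11, 10)
peak = 1.61

linear_at_peak = linear_factor(peak, eps_range)
ce_at_peak = constant_elasticity_factor(peak, eps_range)

# For a small price change the two forms nearly coincide.
small_gap = abs(
    linear_factor(0.01, -0.64) - constant_elasticity_factor(0.01, -0.64)
)
```

At `eps = -0.64` the linear factor is negative because `1 + eps * p` drops below zero once `p` exceeds `1 / 0.64`, while the constant-elasticity factor is a positive power of a positive number and so cannot change sign.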

Stale dashboard data guard:
The JSONs under dashboard/src/data/ still held uniform-0.15 numbers
from the old run, textually migrated to British spelling but not
regenerated. Rendering would show the OLD analysis with NEW labels.
Replaced each JSON with a stub carrying _stale: true. Dashboard
checks for the flag at top level and renders a Data generation
pending banner with the regen command instead of the full panels,
so the site never shows misleading numbers until someone regenerates
with HUGGING_FACE_TOKEN set.

Dashboard prose:
Replace Labandeira-uniform language in three places with the
Priesmann decile-specific account and cite the constant-elasticity
form. Remove the elasticity = -0.15 KPI tooltip; add a Mean
elasticity KPI that reads from results.behavioural.mean_elasticity.

README sync:
- Add HUGGING_FACE_TOKEN requirement to install steps
- Swap uniform-elasticity description for the decile-specific account
- Update Requirements and Tech stack to match ==2.88.0 pin
- Add pytest section

Licensing + packaging:
- Add LICENSE (AGPL-3.0-or-later)
- uvx ruff check --fix --unsafe-fixes . + ruff format . clean
- Swap dashboard-build CI job from npm to bun; bun lockfile committed,
  npm lockfile removed

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
policyengine.py 4.1.0 can't boot without upstream fixes to its
bundled UK release manifest. The pinned data-package version has no
release_manifest.json published on Hugging Face. The
bundled-certification fallback requires an exact model-version string
match, which the installed 2.88.0 satisfies, but the manifest fetch
short-circuits before that fallback is reached.

Revert the simulation layer to raw policyengine_uk.Microsimulation —
exact same reform-dict shape, exact same output values, just without
the manifest bootstrap drama. Migration to policyengine.py can be
revisited once upstream stabilises.

Regenerate all 10 dashboard JSON files with the new decile-specific
elasticity + log-linear constant-elasticity behavioural model:

  UK:                32.0 M households, mean energy £1,496
  England:           26.9 M,            £1,498
  Scotland:           2.7 M,            £1,479
  Wales:              1.6 M,            £1,363
  Northern Ireland:   0.7 M,            £1,742

Spot-check +60% behavioural hit by decile:

  D1  (ε -0.640):  £181  (31 % of static)
  D5  (ε -0.404):  £537  (54 %)
  D10 (ε -0.110): £1,282 (87 %)

The previously flat +45.6 % behavioural line under the uniform
−0.15 model is now a steep progressive slope — the Priesmann
progressivity that the old model suppressed is now the headline.
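
The percentage column in the +60 % spot-check can be reproduced from the formula alone. This sketch assumes the behavioural hit is energy spend times `(1 + p) ** (1 + ε) − 1` and the static hit is energy spend times `p`, so their ratio does not depend on the spend level; the pound figures need the microsimulation and are not reproduced here.

```python
import numpy as np

# Elasticities interpolated linearly from D1 = -0.64 to D10 = -0.11.
eps = np.linspace(-0.64, -0.11, 10)
p = 0.60  # +60 % price shock

# Behavioural extra cost as a share of the static (no-response) extra cost.
share_of_static = ((1.0 + p) ** (1.0 + eps) - 1.0) / p

d1, d5, d10 = share_of_static[0], share_of_static[4], share_of_static[9]
```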

Q1 2023 peak (+161 %) behavioural is now a defensible +£1,568 avg
instead of the broken negative values the linear form was producing.

pyproject.toml drops the policyengine package from direct deps
(still transitively reachable if anyone wants to try the unified
API manually).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…consumption-pct, README, LICENSE

Four codex-flagged blockers resolved:

1. CI Test job: setup-uv without python-version doesn't provide a
   system Python, so `uv pip install --system` fails. Replace with
   `uv venv --python 3.13 .venv && uv pip install --python .venv/bin/python`
   and run pytest from the venv.

2. Council Tax Rebate was paying non-English households because the
   PE-UK formula keys off council_tax_band alone (non-NONE outside
   England in the FRS imputation). Real 2022 policy was England-only.
   Add `_england_mask_unfiltered(sim, cmask)` and zero out non-English
   rows in all three call sites (`policy_ct_rebate`, `policy_post_shock`
   via `hh_payments`, and `policy_combined`).

   Verified post-regen: Scotland / Wales / Northern Ireland now show
   £0 CT rebate; England at £6.5 bn (down from £7.7 bn UK-wide).

3. `consumption_reduction_pct` in the behavioural output was still
   using the linear first-order `eps_d * price_pct` instead of the
   log-linear `(1+p)^eps - 1`. At +161 % for D1 it reported
   `-102.9 %`, physically impossible. Fixed to the log-linear form;
   now D1 Q1-2023-peak reduction reports `-45.8 %`, matching
   `(1 + 1.61) ** −0.64 = 0.5412`.

4. README was still describing the reverted policyengine.py migration
   path and showed npm install commands. Updated to match the raw
   policyengine_uk implementation and the bun-based dashboard tooling
   the CI workflow uses.
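
A minimal sketch of the England-only gate from item 2, assuming hypothetical per-household arrays. The country labels and the function name `england_only` are illustrative; the real helper is `_england_mask_unfiltered(sim, cmask)`, whose body is not shown in this PR.

```python
import numpy as np

# Hypothetical per-household inputs. The modelled rebate is non-zero for
# A-D band households regardless of country, which is the bug being fixed.
country = np.array(
    ["ENGLAND", "SCOTLAND", "WALES", "NORTHERN_IRELAND", "ENGLAND"]
)
modelled_rebate = np.array([150.0, 150.0, 150.0, 150.0, 0.0])


def england_only(rebate, country):
    """Zero out the rebate for every household outside England."""
    return np.where(country == "ENGLAND", rebate, 0.0)


gated_rebate = england_only(modelled_rebate, country)
```

Applying the mask at the analysis layer leaves the upstream formula untouched, so the same gate has to be repeated at every call site that reads the rebate.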

Plus two codex nits:

- LICENSE replaced with canonical 661-line AGPL-3.0 text (was an
  abridged preface-only file that wouldn't satisfy GitHub's license
  scanner).
- Tech-stack line sync'd with pyproject.toml.

Regenerated all 10 dashboard JSONs. Ruff clean, 6 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
One blocker and three nits from codex round 2:

1. Public-facing copy inconsistent with England-only CT rebate:
   - README.md:10: add ", England only (mirrors the 2022 Council Tax
     Rebate's geographic scope)" to the rebate bullet.
   - Dashboard POLICY_META description: reworded to note VOA's ~63 %
     A-D headline and explain why PolicyEngine's FRS-based band
     imputation can overshoot that in the model output.
   - Dashboard "Key findings" summary: qualify the rebate as
     "England only".
   - sections.py JSON policy description string: include the
     England-only qualifier so downstream consumers of the JSON
     aren't misled.

2. filter_by_country() replaced row-level data["country"] with a
   scalar string, which quietly broke the shape contract (downstream
   groupby / boolean masking by country assumed an array). Keep
   data["country"] as a per-household array; move the scalar filter
   name to a new data["country_label"]. Tested against all four
   constituent nation outputs.

3. CI bun install fallback (bun install --frozen-lockfile || bun
   install) masked lockfile drift. Drop the fallback so CI fails
   deterministically if bun.lock and package.json go out of sync.
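
The shape contract in item 2 can be sketched as follows, assuming `data` is a dict of equal-length per-household arrays. That structure and the function body are assumptions for illustration; only the `country` and `country_label` key names come from the commit message.

```python
import numpy as np


def filter_by_country(data, country_name):
    """Keep only households in `country_name`.

    `data["country"]` stays a per-household array after filtering; the
    scalar name of the filter goes into a separate `country_label` key.
    """
    mask = data["country"] == country_name
    filtered = {key: values[mask] for key, values in data.items()}
    filtered["country_label"] = country_name
    return filtered


# Hypothetical input with three households.
data = {
    "country": np.array(["ENGLAND", "WALES", "ENGLAND"]),
    "energy_spend": np.array([1500.0, 1360.0, 1490.0]),
}
england = filter_by_country(data, "ENGLAND")
```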

Regenerated all 10 dashboard JSONs after the filter_by_country fix,
reran ruff + pytest + bun build — all clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
policy_net_position() isn't wired into generate.py today, but if it
gets revived it would reintroduce the non-English CT-rebate bug.
Apply the same England-only mask as the other CT-rebate call sites.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four blockers and five material concerns from the Opus 4.7 publication
review.

B1. Rename `results_v2*.json` → `results_breakdowns*.json`. The `_v2`
    suffix violates Max's global naming rule ("use git for history");
    `breakdowns` describes what the payload contains (electricity/gas
    split, tenure, household type, country, NEG). All 5 JSONs renamed,
    Dashboard.jsx imports + destructuring updated
    (resultsV2 → breakdowns, ALL_DATA.<country>.v2 → .breakdowns),
    generate.py output path + log lines updated.

B2. Sdist allowlist in pyproject.toml. Without it, hatchling included
    the entire repo — dashboard subtree (bundled JSONs + React source)
    inflated the source distribution by ~10 MB. Restrict to
    `energy_shock/**/*.py`, `tests/**/*.py`, README, LICENSE,
    pyproject.toml. Built tarball is now 30 KB with 13 files.

B3. Drop unused code paths from dashboard output:
    - `policy_post_shock.epg` was shipped but never selected in the
      dashboard (POLICY_META only exposes flat_transfer, ct_rebate,
      bn_transfer, bn_epg, neg). Removed `epg` branch + pe_policies
      entry.
    - `policy_post_shock.rbt` (rising-block-tariff-under-shock scenario
      loop, ~100 lines) also never consumed. Removed.
    - Standalone `rising_block_tariff()` in sections.py + its call in
      generate.py + `rising_block_tariff` key in breakdowns output —
      all unused. Removed.
    - `RBT_DISCOUNT_RATE` constant no longer needed. Dropped.
    - Results JSON files now 40% smaller (327 KB → 196 KB per country).

B4. `welfare_loss_comfort_avg` (Harberger-triangle approximation,
    `0.5 * |ε| * p^2 * energy`) was inconsistent with the constant-
    elasticity demand used everywhere else — overstates welfare loss
    3× at the +161 % scenario. The field was never rendered in the
    dashboard, so dropped rather than replaced with the exact
    log-linear CS integral. Also removed a dead `energy * price_pct`
    expression statement that the same block carried.

M1. Added Priesmann transferability caveats to README methodology
    section and dashboard methodology prose: the paper estimates from
    German *gas* demand, linear interp between D1 and D10 is a
    convenience, and UK electricity is typically less elastic than
    gas — so behavioural bill savings should be read as an upper
    bound on consumer adjustment.

M2. Prose correction in 3 places (KPI info tooltips + methodology
    paragraph + intro): "imputed from NEED 2023" → "imputed from
    Living Costs and Food Survey, calibrated against NEED 2023
    administrative totals". NEED is an aggregate admin dataset, not
    household-level; LCFS is the household survey PolicyEngine uses
    for energy consumption imputation.

M4. Removed the `DATA_STALE` guard + "Data generation pending" stub
    screen from Dashboard.jsx. It tested for `_stale: true` on
    bundled JSONs, but no code in the package ever emits that key —
    pure dead code.

M5. README setup instructions and generate.py module docstring
    switched from `conda activate python313` / `pip install -e .` to
    `uv venv --python 3.13 .venv && source .venv/bin/activate` /
    `uv pip install -e .` to match the CI install path and Max's
    standard Python tooling.

Regenerated all 10 dashboard JSONs under the new code; reran ruff
check + format, pytest (6/6), sdist build, bun install + build — all
clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four policy functions in energy_shock/sections.py were fully implemented
but never called from generate.py's _run_one and therefore produced no
output in any of the dashboard JSONs: policy_epg, policy_wfa,
policy_combined, policy_net_position. Remove them so the module
accurately reflects the analyses the dashboard actually consumes, and
drop the now-unused EPG_TARGET / WFA_HIGHER / WFA_LOWER constants from
config.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vahid-ahmadi vahid-ahmadi changed the title from "Migrate to policyengine.py API; finish RBT post-shock; pin deps" to "Methodology review fixes: England-only CT rebate, log-linear behavioural form, decile-specific elasticities" on Apr 21, 2026
Round-4 fresh-team referee report surfaced blockers the earlier Codex
rounds missed. Addressed here:

Methodology / practitioner
  • Post-policy "extra cost" metric was `max(shocked_e − p, e)`,
    clipping the residual at baseline. Low deciles under shock-matching
    or flat transfers were in fact over-compensated (e.g. D1 flat
    transfer at +10 %: residual £10 but average household is £302
    better off than baseline). Now emit a signed `net_change` and
    `behavioural_net_change` field per decile and per group alongside
    the existing clipped `extra_cost`. Dashboard tooltip now shows both
    so the progressivity story lands.
  • Section-description prose explicitly explains the residual vs
    net-change semantics so readers don't misread the chart as "every
    decile still hurt".
  • NEG subsidy base kept static (pre-shock consumption up to 2,900
    kWh), but the choice is now documented in both `neg_policy`'s
    docstring and the policy description card. An alternative indexing
    to actual post-response consumption would shrink subsidy cost for
    high-elasticity low-income deciles that cut below threshold; noted
    as a design caveat.
  • Deleted dead code the earlier review rounds had overlooked:
    `policy_epg`, `policy_wfa`, `policy_combined`, `policy_net_position`,
    `regional_breakdown`, `gas_price_cap`. None were called from
    `generate.py` or consumed by the dashboard. Also removed
    `EPG_TARGET`, `WFA_HIGHER`, `WFA_LOWER` constants and the
    `RBT_DISCOUNT_RATE` leftover. Net cut: ~450 lines of sections.py.
  • Removed leftover unused tuple / bare-expression statements in
    `baseline_summary` and `shock_scenarios`.
  • Strengthened `test_behavioural_factor_physically_admissible_at_extreme_shock`
    to assert explicitly that the linear first-order form is negative
    at (ε=−0.64, p=1.61) and that the returned factor is NOT close to
    that linear form — guards against future regression to the
    quadratic Taylor approximation. Added
    `test_epsilon_fallback_honours_weights` covering the non-uniform
    weighting path.
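
The difference between the clipped `extra_cost` and the signed `net_change` can be seen on a toy example. The spend levels and the payment below are made up, and the sketch assumes `extra_cost` is the floored post-policy bill minus the baseline bill; it does not reproduce the D1 figures quoted above.

```python
import numpy as np

# Hypothetical households: baseline energy spend in pounds per year.
baseline = np.array([900.0, 1500.0, 2400.0])
shock = 0.10      # +10 % price shock, static (no behavioural response)
payment = 150.0   # flat transfer per household

shocked = baseline * (1.0 + shock)

# Signed change relative to baseline: negative means over-compensated.
net_change = shocked - payment - baseline

# Clipped residual: the post-policy bill is floored at the baseline bill,
# so over-compensation shows up as zero rather than as a gain.
extra_cost = np.maximum(shocked - payment, baseline) - baseline
```

Averaging `extra_cost` over a group hides any household whose transfer exceeds its shock, which is why a group can show a small positive residual while being better off on average.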

Citations (URLs now resolve to the claimed source)
  • fn-2 Cornwall Insight: swap to the 4 March 2026
    july-price-cap-forecast-rises-to-1800 release (the previous URL
    went to a May 2025 £1,720 announcement).
  • fn-3 Stifel/£2,500: swap to the GB News
    energy-bills-uk-households-iran-gas-prices article (the previous
    URL went to an unrelated £160 Cornwall Insight piece).
  • fn-4 Resolution Foundation: swap to the £480 comment piece (the
    previous URL went to an earlier £500 press release).
  • fn-8: clarify that £4,279 is the *announced* Q1 2023 cap — the
    concurrent Energy Price Guarantee held typical bills at £2,500.

Domain / scope
  • README and dashboard methodology now flag (a) the gas-vs-combined
    shock simplification, (b) the ~£290/yr of fixed standing charges
    that a uniform-percentage shock rescales, and (c) the +161 %
    scenario being an illustrative stress-test not a realised episode.
  • NEG threshold 2,900 kWh is now attributed to Bangham (2026)
    mirroring the 2022 Austria/Netherlands relief design, not a "UK
    median" (the Ofgem TDCV is closer to 2,700 kWh).

Reference wiring
  • `Labandeira et al. (2017)` in-text mention now links to its
    reference entry.
  • fn-12 (EPG) linked from the cap-freeze subsidy description.
  • fn-13 (HM Treasury factsheet) linked from the methodology intro.

UI polish
  • Inline "illustrative scenario" banner on the +161 % scenario's KPI
    panel in both the Impact and Policy responses tabs.
  • "Shock-match" label unified to "Shock-matching" across the Policy
    responses tab (matches the prose and POLICY_META.fullName).

Reproducibility
  • `vercel.json` now uses `bun install --frozen-lockfile && bun run
    build` to match CI's deterministic install path (was `npm install`,
    which synthesised a lockfile on the fly).
  • CI test job now runs a Python 3.13 + 3.14 matrix (the classifier
    advertised 3.14 but only 3.13 was exercised).
  • Removed empty `tests/__init__.py` (pytest auto-discovers).

Regenerated all 10 dashboard JSONs under the slimmer code. ruff check
+ format, pytest (7/7), bun dashboard build, sdist build (30 KB, 13
files) all clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex round-5 blocker and nits on top of d41a760.

1. Policy-cost chart contradicted itself under the behavioural toggle.
   `getExchequer` shrank shock-matching and NEG by the whole-population
   behavRatio even though neither policy scales with consumer response:

   - Shock-matching is a flat per-household payment pegged to the
     *static* average shock (`sections.py` `policy_key == "bn_transfer"`),
     so aggregate cost is fixed once the payment is set.
   - NEG subsidy is explicitly indexed to each household's pre-shock
     consumption up to 2,900 kWh (documented in the `neg_policy`
     docstring and the `subsidy_indexed_to` JSON field), so aggregate
     cost does not depend on whether households cut consumption.

   Only the cap-freeze subsidy (`bn_epg`) reimburses each household's
   actual bill increase, so it alone scales with the behavioural /
   static ratio. `getExchequer` now reflects that. The Policy-at-a-
   glance chart no longer disagrees with the selected-policy panel at
   `dashboard/src/components/Dashboard.jsx:725`.

2. Tooltip display polish:
   - Negative net-change values now format as `-£366` rather than
     `£-366`.
   - Metric label renamed from "net change" to "net change vs
     baseline" so the reference point is explicit.

3. `consumption_reduction_pct` in the `behavioural` section reported a
   negative number even though the field name reads as a positive
   reduction ("D1 loses 27%"). Flipped the sign: at +10% the D1
   elasticity now reports a 5.9% reduction; at the +161% peak, 45.8%.
   The comment block was correct all along — the code was off.
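
The sign convention in item 3 can be checked with a short sketch, assuming the reduction is `1 − (1 + p) ** ε` expressed as a positive percentage. The function below is illustrative and is not the package's implementation.

```python
def consumption_reduction_pct(price_change, elasticity):
    """Percentage cut in consumption, reported as a positive number.

    Under constant elasticity, consumption scales by (1 + p) ** eps,
    so the reduction is one minus that factor.
    """
    return (1.0 - (1.0 + price_change) ** elasticity) * 100.0


d1_at_10 = consumption_reduction_pct(0.10, -0.64)
d1_at_peak = consumption_reduction_pct(1.61, -0.64)
```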

Regenerated all 10 dashboard JSONs; ruff + pytest (7/7) + bun dashboard
build all pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex round-6 caught that the per-policy KPI panel at
`Dashboard.jsx:725` still hardcoded static values for cap-freeze
(bn_epg). Under the behavioural toggle at +60%, the panel therefore
reported benefit ≈£1,045/yr and cost ≈£33.4bn even though the
response-consistent numbers are ≈£637/yr and ≈£20.4bn.

Cap-freeze (bn_epg) is the only modelled policy whose payment scales
with the consumer response — the government reimburses each
household's *actual* bill increase, which falls when households cut
consumption. Shock-matching, NEG, flat transfer and CT rebate all pay
an amount that is invariant to behavioural response, so their KPI
values correctly stay static.

`getExchequer` on the overview chart already applies `behavRatio` to
`bn_epg` only; this commit aligns the selected-policy KPI panel with
that logic so the chart and the card agree.
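
The rule shared by the chart and the KPI card can be sketched in Python (the dashboard code itself is JavaScript, and `exchequer_cost` is a hypothetical name, not `getExchequer`). The cap-freeze figures are the approximate +60 % values quoted above; the NEG cost is a made-up placeholder.

```python
# Only the cap-freeze subsidy reimburses the actual bill increase, so it is
# the only policy whose cost scales with the behavioural / static ratio.
RESPONSE_SCALED_POLICIES = {"bn_epg"}


def exchequer_cost(policy_key, static_cost_bn, behav_ratio, behavioural):
    """Return the policy cost in billions under the selected toggle."""
    if behavioural and policy_key in RESPONSE_SCALED_POLICIES:
        return static_cost_bn * behav_ratio
    return static_cost_bn


# Approximate +60 % cap-freeze figures quoted in this commit message.
behav_ratio = 20.4 / 33.4
cap_freeze_behavioural = exchequer_cost("bn_epg", 33.4, behav_ratio, True)

# Placeholder cost for a policy that does not scale with the response.
neg_behavioural = exchequer_cost("neg", 10.0, behav_ratio, True)
```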

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@MaxGhenis MaxGhenis merged commit 85249c4 into main Apr 21, 2026
6 checks passed
@MaxGhenis MaxGhenis deleted the review-fixes branch April 21, 2026 19:59
vahid-ahmadi added a commit to PolicyEngine/og-model-dashboard that referenced this pull request Apr 28, 2026
Applies the same review lens as PolicyEngine/energy-price-shock#1.

Methodology / correctness:
- TransitionMacroImpact has no r_baseline / r_reform fields. The
  Code-tab block under map_transition_to_real_world wrongly listed
  them; replaced with the actual interest-rate paths on the TPI
  results (base_tp.r, reform_tp.r).
- impact.years on TransitionMacroImpact holds fiscal-year strings
  ("2026-27", ...), not ints. Updated the inline comment.
- Default TPI T is 60 (oguk_default_parameters.json), not 30. Fixed
  the Code-tab header, transition-path narrative, the Example-tab
  description, and the Methodology SVG label.
- Flagged the steady-state terminal output as illustrative — the
  numbers were plausible but not from a real model run.

CI:
- New .github/workflows/checks.yml runs node --check on the inline
  JS, verifies every locally-referenced asset exists, and asserts
  the four oguk API symbols the Code tab cites are still exported
  upstream.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>