GAM chapter (closes #775) by d-morrison · Pull Request #779 · d-morrison/rme

d-morrison · 2026-05-20T11:02:52Z

Summary

Initial draft of a new chapter on generalized additive models, addressing d-morrison/rme#775. Placed at the end of the GLMs part of the book; structure follows ISL Ch 7 progression with extensions from Wood 2017 for the penalized-likelihood / REML / inference content.

Chapter outline (11 sections)

Motivation — why nonlinear: age, BMI, temperature exposure-mortality examples.
Polynomial regression — definition, quadratic example, limitations (global behavior, no local control, unstable degree selection, extrapolation explodes).
Step functions — categorized predictors, when it's appropriate, when it isn't.
Basis functions — the unifying basis-expansion framework that polys and step functions both fit into.
Regression splines — piecewise polys, knots, smoothness constraints, truncated power basis, natural cubic splines, knot placement, splines::ns() / splines::bs().
Smoothing splines — penalized RSS, effective degrees of freedom, REML/GCV smoothness selection, regression-vs-smoothing comparison.
Local regression — LOESS, strengths, limitations, role as a diagnostic smoother.
GAMs — the additive model definition, interpretability, escape hatches (by =, te()).
Fitting in R — mgcv::gam(), s() arguments, REML vs GCV, gam.check(), plot.gam().
Inference — reading summary(), EDFs, smooth-term p-value caveats per @wood2013pvalues, AIC, when to prefer a GLM.
Worked example — logistic GAM on MASS::Pima.tr, from glm baseline → gam → partial-effect plots → response-scale plot.

New bib entries

@aune2016bmi — Aune et al. 2016 BMJ BMI/mortality meta-analysis
@gasparrini2015mortality — Gasparrini et al. 2015 Lancet temperature/mortality multi-country study
@wood2013pvalues — Wood 2013 Biometrika on smooth-term p-values

Caveats / known follow-ups

Worked example uses MASS::Pima.tr (built-in, no extra data dependency). The intro note flags that this should be swapped for a course-appropriate Epi 204 case study before the chapter is finalized.
No figures yet. The polynomial / step / spline / LOESS / GAM sections describe shapes but don't include code chunks that render comparison plots. A follow-up should add per-method illustration figures using a single shared synthetic dataset.
No exercises section. Most other rme chapters have an _exr-*.qmd exercises subfile; one should be added once the chapter content stabilizes.

Pre-commit checks

✅ quarto render chapters/generalized-additive-models.qmd --to html
✅ lintr::lint() on all 11 new subfiles (no lints)
✅ spelling::spell_check_package() (no errors)

Test plan

Render the chapter and verify all 11 sections appear under sensible headings.
Cross-reference checks: every @def-*, @exm-*, @thm-* label resolves.
Substantive content review (math correctness, citation accuracy, pedagogical flow).
Decide whether to swap MASS::Pima.tr for a course dataset in this PR or a follow-up.

🤖 Generated with Claude Code

d-morrison · 2026-05-20T11:21:36Z

@claude review

github-actions · 2026-05-20T11:21:49Z

👀 Picked up by workflow run #26159255775. R/Quarto/renv setup runs first (~3-5 min); Claude itself responds after that.

d-morrison · 2026-05-20T16:58:41Z

@claude review

github-actions · 2026-05-20T16:58:54Z

👀 Picked up by workflow run #26177399193. R/Quarto/renv setup runs first (~3-5 min); Claude itself responds after that.

d-morrison · 2026-05-20T19:52:38Z

Round-3 review came back with 12 items across 3 priority tiers, but on verification most don't match the current source. Per-item audit:

Hallucinated (not present in 7b2f8916):

(1) "Duplicate x = in labs()" — file has labs(x = "BMI (kg/m²)", y = "...", caption = "...") — one x = and one y =. No duplicate.
(2) "smoaths typo in _sec_gams.qmd" — heading actually reads "Adding interactions and tensor smooths" (correctly spelled).
(3) "Raw \beta / \varepsilon / \mathbb{E}" — none of these appear in any of the three named files. They were all fixed in the previous round (7b2f8916 macro-bug commit).
(4) "Missing #| code-fold: true on visible R chunks in worked example" — every chunk in _sec_gam_example.qmd is a prose code block (```r literal lowercase, not ```{r} executed). code-fold doesn't apply to non-executed display chunks.
(5) "Figures not wrapped in figure divs" — same reason: no figure-producing chunks in the worked example, just code listings.
(7) "##### headings in _sec_smoothing_splines.qmd" — no 5-hash headings anywhere in that file.
(12) "4-colon fences in _sec_basis_functions.qmd" — verified with re.match(r'^(:+)', ...): all fences are 3 colons.
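
The fence-length check mentioned in item (12) can be sketched roughly as follows. This is a minimal illustration and not the script used in the session: the function name `colon_fence_lengths` and the sample text are hypothetical, and only the `re.match(r'^(:+)', ...)` pattern comes from the comment above.

```python
import re
from collections import Counter


def colon_fence_lengths(text: str) -> Counter:
    """Count div fences by the number of leading colons on each line."""
    counts = Counter()
    for line in text.splitlines():
        match = re.match(r"^(:+)", line)
        # Pandoc fenced divs need at least three consecutive colons.
        if match and len(match.group(1)) >= 3:
            counts[len(match.group(1))] += 1
    return counts


# Hypothetical sample text, not the contents of _sec_basis_functions.qmd.
sample = "::: notes\nSome text.\n:::\n\n:::: {.callout-note}\n::: inner\n:::\n::::\n"
lengths = colon_fence_lengths(sample)
print(dict(lengths))
```

A file in which every fence uses three colons would produce a counter with the single key `3`. The pattern only matches fences that start in column one, so indented fences would need a looser regex.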

Actionable (fixed in c48423183):

(6) The fig-method-comparison-data-fits chunk label was misleading because the chunk produces no figure (include: false — it fits models and stores predictions for the four panel chunks to plot). Renamed to method-comparison-data-fits (dropped fig- prefix).
(11) Added [@wood2017generalized, §5.4] locator to the select = TRUE description in _sec_gam_fitting.qmd.

Deferred (acknowledged in PR description):

(8) No exercises section.
(9) MASS::Pima.tr placeholder dataset.

Skipped per CLAUDE.md:

(10) "Page/figure locators for @aune2016bmi and @gasparrini2015mortality" — CLAUDE.md citation policy is "only cite a page number after verifying it directly from the source PDF." No PDFs accessible in this session; can be added in a follow-up once verified.

Triggering another review for confirmation.

d-morrison · 2026-05-20T19:52:39Z

@claude review

github-actions · 2026-05-20T19:52:51Z

👀 Picked up by workflow run #26186262395. R/Quarto/renv setup runs first (~3-5 min); Claude itself responds after that.

d-morrison · 2026-05-20T23:48:23Z

@claude review

github-actions · 2026-05-20T23:48:34Z

👀 Picked up by workflow run #26196646462. R/Quarto/renv setup runs first (~3-5 min); Claude itself responds after that.

d-morrison · 2026-05-21T01:52:52Z

@claude review

github-actions · 2026-05-21T01:53:04Z

👀 Picked up by workflow run #26200740705. R/Quarto/renv setup runs first (~3-5 min); Claude itself responds after that.

d-morrison · 2026-05-21T01:59:40Z

@claude review

d-morrison · 2026-05-21T03:20:33Z

@claude review

github-actions · 2026-05-21T03:20:45Z

👀 Picked up by workflow run #26203430876. R/Quarto/renv setup runs first (~3-5 min); Claude itself responds after that.

claude · 2026-05-27T04:23:57Z

Claude Code is working…

I'll analyze this and get back to you.


claude · 2026-05-28T03:39:33Z

Claude Code is working…

I'll analyze this and get back to you.


d-morrison · 2026-05-28T03:40:06Z

todo: change "Show R code" to "Show code"

claude · 2026-06-02T18:00:22Z

Claude Code is working…

I'll analyze this and get back to you.


claude · 2026-06-03T16:43:54Z

Claude Code is working…

I'll analyze this and get back to you.


d-morrison · 2026-06-03T22:03:20Z

@claude review

github-actions · 2026-06-03T22:03:33Z

👀 Picked up by workflow run #26915839558. R/Quarto/renv setup runs first (~3-5 min); Claude itself responds after that.

d-morrison · 2026-06-03T22:26:47Z

Claude Code CLI (local session) is working on this — paws off until I'm done.

d-morrison · 2026-06-03T22:33:25Z

The latest @claude comments on this PR are in-progress stubs ("Claude Code is working…") that never posted a final verdict — the last completed review was 2026-05-26. I merged current main (8ac3b34) to re-trigger the pull_request review workflow against an up-to-date branch.

I re-audited every finding from the last completed review and the inline comments; all are already resolved in the current branch:

"two more constraints" → "four more constraints" — fixed (_sec_regression_splines.qmd:91).
ISLRv2 "Applied 6" → "Applied 7" — fixed (_exr-gams-applied.qmd:6).
Backfitting no-intercept bias — fixed: R uses coef(lm(r1 ~ x1))[2] + f1 - mean(f1); Python uses np.cov/np.var (_exr-gams-applied.qmd:118,144).
PolynomialFeatures bias column — fixed: include_bias=False (_exr-gams-applied.qmd:61).
Cross-refs inside code block — the @sec-smoothing-splines/@sec-gam-fitting note is now in prose, not a code chunk (_sec_lab.qmd:268).
plot(XX, …) 2D inconsistency and unused SplineTransformer import — fixed/removed (_sec_lab.qmd:252).
aune2016bmi missing pages — added pages = {i2156} (references.bib:1700).
Placeholder callout-warning visible to students — section is now wrapped in ::::: {.content-hidden} (_sec_gam_example.qmd:1).
Subfiles beginning with a heading — none do; all GAM subfiles open with ::: notes.
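
The backfitting item in the list above (slope taken from a fit with an intercept, plus centering) can be illustrated with a small sketch. It uses only the standard library rather than the `np.cov`/`np.var` calls mentioned for the exercise file, the toy data are hypothetical, and it is not the exercise's actual solution code.

```python
def mean(v):
    return sum(v) / len(v)


def slope(r, x):
    """Least-squares slope of r on x from a fit that includes an intercept."""
    mr, mx = mean(r), mean(x)
    cov = sum((ri - mr) * (xi - mx) for ri, xi in zip(r, x))
    var = sum((xi - mx) ** 2 for xi in x)
    return cov / var  # the same normalisation in both sums cancels in the ratio


def backfit_linear(y, x1, x2, n_iter=200):
    """Backfitting with two centered linear smoothers; returns (alpha, b1, b2)."""
    alpha = mean(y)
    m1, m2 = mean(x1), mean(x2)
    b1 = b2 = 0.0
    for _ in range(n_iter):
        # Partial residuals for x1: remove the intercept and the current x2 fit.
        r1 = [yi - alpha - b2 * (v - m2) for yi, v in zip(y, x2)]
        b1 = slope(r1, x1)
        # Partial residuals for x2: remove the intercept and the updated x1 fit.
        r2 = [yi - alpha - b1 * (u - m1) for yi, u in zip(y, x1)]
        b2 = slope(r2, x2)
    return alpha, b1, b2


# Hypothetical noiseless data: y = 1 + 2*x1 - 3*x2, with correlated predictors.
x1 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.0, 0.0, 2.0, 1.0, 3.0, 2.0]
y = [1 + 2 * u - 3 * v for u, v in zip(x1, x2)]
alpha, b1, b2 = backfit_linear(y, x1, x2)
intercept = alpha - b1 * mean(x1) - b2 * mean(x2)
```

Because each update is a simple regression with an intercept, the iteration settles on the ordinary least-squares coefficients. If `np.cov` and `np.var` are combined for the slope, their default normalisations differ (`np.cov` divides by N - 1, `np.var` by N), so the `ddof` arguments should be made to match.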

Fresh review now running on the merge commit.

github-actions · 2026-06-03T22:33:34Z

👀 Picked up by workflow run #26917221781. R/Quarto/renv setup runs first (~3-5 min); Claude itself responds after that.

claude · 2026-06-03T22:33:35Z

Claude Code is working…

I'll analyze this and get back to you.


github-actions · 2026-06-03T22:38:24Z

Now I have enough to write a thorough review.

Code Review — PR #779: GAM chapter (closes #775)

Overview

This PR adds a substantial new chapter on Generalized Additive Models (11 sections, 2,356 lines, 16 new files). The structure follows ISL Ch. 7, the content is mathematically sound, and the pedagogical flow is well-designed. The PR description is transparent about the known gaps (no course-specific dataset, incomplete exercises). Below are the findings.

Must-fix before merge

1. Nine subfiles are missing trailing newlines

Every new _subfiles/ file except _exr-gams-conceptual.qmd, _sec_gam_example.qmd, _sec_gam_fitting.qmd, _sec_gam_inference.qmd, and _sec_lab.qmd ends without a newline character (the diff shows \ No newline at end of file). The affected files:

_exr-gams-applied.qmd
_sec_basis_functions.qmd
_sec_gams.qmd
_sec_local_regression.qmd
_sec_motivation.qmd
_sec_polynomial.qmd
_sec_regression_splines.qmd
_sec_smoothing_splines.qmd
_sec_step_functions.qmd

POSIX text files must end in a newline. Add one to each.
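
One way to check for and repair a missing trailing newline is sketched below. The helper names and the throwaway demo file are hypothetical; the sketch works on a temporary file and does not touch the real subfiles.

```python
import tempfile
from pathlib import Path


def ends_with_newline(path: Path) -> bool:
    """Return True if the file is non-empty and its last byte is a newline."""
    return path.read_bytes().endswith(b"\n")


def ensure_trailing_newline(path: Path) -> bool:
    """Append a newline if one is missing; return True when the file changed."""
    data = path.read_bytes()
    if not data or data.endswith(b"\n"):
        return False  # empty files and already-terminated files are left alone
    path.write_bytes(data + b"\n")
    return True


# Demonstration on a throwaway file with a hypothetical name.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "_sec_demo.qmd"
    demo.write_bytes(b"Last line without newline.")
    before = ends_with_newline(demo)
    changed = ensure_trailing_newline(demo)
    after = ends_with_newline(demo)
```

Working on bytes rather than decoded text avoids any platform-specific newline translation when the file is rewritten.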

2. gratia is used but not declared as a dependency

gratia::draw(), gratia::smooth_samples(), and gratia::compare_smooths() are referenced in _sec_gam_fitting.qmd, _sec_gam_inference.qmd, and _sec_gam_example.qmd. The package is not in DESCRIPTION (Suggests) and not in renv.lock. Either add it as a suggested dependency (renv::install("gratia"); renv::snapshot()) or qualify the prose with "if you install the optional gratia package…" The version note in _sec_gam_inference.qmd (gratia >= 0.9) also floats without being backed by a minimum version in DESCRIPTION.
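
A rough way to catch this kind of undeclared dependency is to compare the packages referenced as `pkg::function` against the `Suggests:` field. The sketch below is illustrative only: the helper names, the sample chunk, and the sample DESCRIPTION text are all hypothetical, and it does not detect packages loaded via `library()`.

```python
import re


def namespaced_packages(source: str) -> set:
    """Collect package names used as `pkg::function` in R source text."""
    return set(re.findall(r"\b([A-Za-z][A-Za-z0-9.]*)::", source))


def suggested_packages(description: str) -> set:
    """Parse package names from the Suggests field of a DESCRIPTION file."""
    # The field runs until the next line that starts in column one.
    match = re.search(r"^Suggests:(.*?)(?=^\S|\Z)", description, re.M | re.S)
    if not match:
        return set()
    entries = [e.strip() for e in match.group(1).split(",")]
    # Drop version constraints such as "(>= 0.9)".
    return {re.sub(r"\s*\(.*\)", "", e) for e in entries if e}


# Hypothetical inputs, not the repository's real files.
chunk = "library(mgcv)\nm <- mgcv::gam(y ~ s(x), data = d)\ngratia::draw(m)\n"
description = (
    "Package: demo\nSuggests:\n    mgcv,\n    knitr (>= 1.40)\nImports:\n    stats\n"
)
used = namespaced_packages(chunk)
declared = suggested_packages(description)
missing = used - declared
print(sorted(missing))
```

In this toy input the only package used but not declared is `gratia`, which mirrors the finding above.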

Should-fix (convention violations)

3. Inconsistent code block syntax in _sec_gam_example.qmd

The worked example section uses plain backtick-r fences throughout (markdown syntax highlighting only), while every other code-containing subfile in this chapter uses {r} Quarto chunk syntax with #| eval: false. These are functionally equivalent when the chunk won't execute, but the inconsistency will confuse readers and break if someone adds #| eval: true later. Recommend converting all blocks to {r} + #| eval: false for consistency, or at minimum add a comment explaining the intentional choice.

4. .content-hidden hides the dataset warning from everyone

In _sec_gam_example.qmd:

:::: {.content-hidden}
::: callout-warning
This section is illustrative only …
:::
::::

{.content-hidden} removes the callout from the rendered HTML output, so neither students nor instructors see the warning. The intent (per PR description) is to flag that MASS::Pima.tr should be replaced with a course dataset. If the goal is an instructor-only reminder, this should be a :::notes block (which renders in presenter view) or a plain Quarto comment. If it's meant as a visible student notice, drop .content-hidden entirely.

Nice-to-have / minor

5. ## Setup {.unnumbered} placement inconsistency

_sec_gam_example.qmd places ## Setup {.unnumbered} inside the subfile. In contrast, for the lab section the parent chapter places ## Setup {.unnumbered} before the {{< include >}}. Neither pattern violates the convention (the rule only prohibits top-level # headings in subfiles), but for consistency across the chapter the ## Setup in the worked example could be moved to the parent.

6. s() argument table missing id row

_sec_gam_fitting.qmd has a useful s() arguments table, but id (linking smooths to share a smoothness parameter across predictors) is a commonly-used argument in multi-predictor epidemiologic models. Worth a one-line addition.

Content quality (no action needed, FYI)

The mathematics is correct throughout:

Cubic spline parameter count ($K+4$, reduced to $K$ by natural spline boundary constraints) ✓
Truncated power basis continuity proof in the conceptual exercise ✓
Smoothing-spline penalty limits ($\lambda \to 0$ interpolates, $\lambda \to \infty$ → OLS line) ✓
Backfitting convergence to OLS for linear smoothers ✓
EDF interpretation (per-smooth column excludes intercept, so linear smooth shows EDF ≈ 1, not 2) ✓
Confidence intervals computed on the link scale then back-transformed — correct and the reasoning is clearly explained ✓
method = "ML" required for AIC comparison with glm() — correct and the explanation is the right level of detail ✓
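
For reference, the smoothing-spline limits in the list above follow from the penalized residual sum of squares. It is written here in the standard textbook form; the chapter's own notation may differ.

$$
\hat{f}_\lambda = \arg\min_{f} \; \sum_{i=1}^{n} \bigl(y_i - f(x_i)\bigr)^2 + \lambda \int f''(t)^2 \, dt
$$

As $\lambda \to 0$ the penalty vanishes, so (for distinct $x_i$) a curve through every observation attains zero loss. As $\lambda \to \infty$ any curvature becomes infinitely costly, which forces $f'' = 0$ and leaves the least-squares line.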

The section on tensor-product smooths (te()) correctly identifies the key point about non-comparable scales and contrasts with isotropic smooths.

Citation check: Both @me4 and @hastie2009esl2e resolve in references.bib — no phantom keys.

(posted by @claude post-step from workflow run)

claude · 2026-06-05T00:50:06Z

Claude Code is working…

I'll analyze this and get back to you.


claude · 2026-06-06T08:25:28Z

Claude Code is working…

I'll analyze this and get back to you.


Initial draft of a chapter on generalized additive models. Squashed from the original 69-commit history because the action's recursive submodule fetch was failing on commits that referenced a now-deleted latex-macros SHA. See the PR description for the chapter outline, new bib entries, and caveats.

claude · 2026-06-09T06:29:16Z

Claude finished @d-morrison's task in 5m 56s

Code Review — GAM Chapter (PR #779)

Gather context from prior review rounds
Read all changed files on current branch
Check which prior-round issues are resolved
Fresh review pass on all changed files
Post inline findings and summary

Resolved since 2026-06-03 review

The two code-correctness bugs flagged in prior rounds are confirmed fixed:

_sec_lab.qmd: plot(XX.ravel(), ...) — 2D array inconsistency fixed ✓
_sec_lab.qmd: SplineTransformer unused import removed ✓

Still open: must-fix

1. Nine subfiles missing trailing newlines

POSIX text files must end with \n. Every file below currently ends mid-content with no trailing newline, confirmed by dumping each file's last byte (tail -c 1 piped to od -An -tx1):

| File | Last byte |
|------|-----------|
| `_exr-gams-applied.qmd` | `3a` (`:`) |
| `_sec_basis_functions.qmd` | `2e` (`.`) |
| `_sec_gams.qmd` | `2e` (`.`) |
| `_sec_local_regression.qmd` | `2e` (`.`) |
| `_sec_motivation.qmd` | `2e` (`.`) |
| `_sec_polynomial.qmd` | `2e` (`.`) |
| `_sec_regression_splines.qmd` | `60` (`` ` ``) |
| `_sec_smoothing_splines.qmd` | `2e` (`.`) |
| `_sec_step_functions.qmd` | `2e` (`.`) |

Fix: add a single \n to the end of each file.


2. `gratia` used in 2 files but not in `DESCRIPTION`

gratia is called in two subfiles but is absent from both Suggests: in DESCRIPTION and renv.lock:

_sec_gam_example.qmd:20 — library(gratia) in setup block; gratia::draw(m_gam) at line 116
_sec_gam_inference.qmd:77 — gratia::smooth_samples() (with version note >= 0.9); gratia::compare_smooths() at line 97

Without the declaration, R CMD CHECK will warn, the chapter won't render on a fresh environment, and students following the code will hit a missing-package error.

Fix: add gratia (>= 0.9) to Suggests: in DESCRIPTION and run renv::snapshot() to lock the version.


Still open: should-fix

3. `.content-hidden` makes the dataset-placeholder warning invisible to everyone

_sec_gam_example.qmd:1–7

:::: {.content-hidden}
::: callout-warning
This section is illustrative only — the dataset is a built-in R
example, not an Epi 204 case study. …
:::
::::

{.content-hidden} removes the callout from all rendered output (HTML, slides, PDF), so neither students nor instructors see it. The intent (per PR description) is to flag the placeholder dataset; if it should only be visible during review, use a Quarto comment (`<!-- ... -->`). If it's a student-visible reminder, drop .content-hidden. Either way, the current form defeats its own purpose.


4. Display-only R code blocks use markdown syntax instead of Quarto syntax

_sec_gam_example.qmd, _sec_gam_fitting.qmd, _sec_gam_inference.qmd, _sec_local_regression.qmd, _sec_polynomial.qmd, _sec_regression_splines.qmd, and _sec_smoothing_splines.qmd all show R code as plain ```r markdown blocks rather than ```{r} #| eval: false Quarto chunks. The rest of the chapter uses Quarto syntax.

The practical consequence: plain markdown blocks cannot carry #| code-fold: true, #| label:, or any Quarto chunk option, and behave differently in formats like revealjs. For sections that should never execute but should respond to Quarto chunk options, the recommended form is:

```{r}
#| eval: false
#| code-fold: true
… code …
```

Nice-to-have

5. `_sec_lab.qmd:36` — `# noqa: E402` on a commented-out import

# from pygam import LinearGAM, LogisticGAM, s as gam_s, f as gam_f  # noqa: E402

The # noqa: E402 annotation suppresses a linter warning for import-not-at-top-of-file. Since the entire import is commented out, the annotation does nothing — but it will puzzle readers who wonder why there's a linter suppression on dead code. Remove the # noqa or the whole line.

6. `s()` argument table missing `id` (`_sec_gam_fitting.qmd:43–50`)

The table lists bs, k, by, m, pc. The id argument (linking smooths to share a smoothness parameter across predictors) is a commonly-used mgcv feature in multi-predictor models and worth a one-line entry.

Summary

| # | Severity | File | Line | Issue |
|---|----------|------|------|-------|
| 1 | Must-fix | 9 subfiles | (end-of-file) | Missing trailing newlines (carry-over) |
| 2 | Must-fix | `_sec_gam_example.qmd`, `_sec_gam_inference.qmd` | 20, 116, 77, 97 | `gratia` used but not in DESCRIPTION (carry-over) |
| 3 | Should-fix | `_sec_gam_example.qmd` | 1–7 | `.content-hidden` hides placeholder warning from everyone (carry-over) |
| 4 | Should-fix | 7 subfiles | throughout | Display R blocks use markdown `` ```r `` not Quarto `{r}` (carry-over) |
| 5 | Low | `_sec_lab.qmd` | 36 | Orphaned `# noqa: E402` on commented-out import (new) |
| 6 | Low | `_sec_gam_fitting.qmd` | 43–50 | `s()` table missing `id` argument (carry-over) |

Items 1 and 2 are blockers for a clean build on a fresh environment. Items 3–6 are convention/clarity issues. The mathematical content and code correctness of the chapter remain sound.

…chapter

…deps

- Add trailing newlines to 9 _subfiles/generalized-additive-models/ files. POSIX requires text files to end with a newline; many tools (cat, sed, diff, git's last-line tracking) misbehave without one.
- Convert 7 subfiles' display-only ```r blocks to ```{r}\n#| eval: false so they render as code fences without trying to evaluate. Keeps the prose-illustration semantics but pulls the snippets into Quarto's code-display path (gets syntax highlighting + copy button + per-language config), instead of the plain Pandoc one.
- _sec_gam_example.qmd: remove the outer .content-hidden div around the callout-warning. The whole point of the warning ("dataset is illustrative; swap for an Epi 204 case study") is to be visible to readers; hiding it defeats it. The unwrapped callout shows in HTML, slides, and PDF, which is the intended audience.
- DESCRIPTION: declare gratia (>=0.9) and mgcv in Suggests. Both are loaded in the chapter (eval:false), and listing them lets renv resolve consistent versions when the chapter ever gets flipped to eval:true.

claude · 2026-06-09T08:02:44Z

Claude finished @d-morrison's task in 4m 49s

Code Review — GAM Chapter (PR #779)

Gather context from prior review rounds
Read all changed files on current branch
Check which prior-round issues are resolved
Fresh review pass on all changed files
Post findings and summary

Resolved since 2026-06-03 review (4 must-fix / should-fix items)

Commit 719594b ("GAM chapter cleanups: trailing newlines, eval:false, content-hidden, deps") addressed all four outstanding blockers:

| # | File | What was fixed |
|---|------|----------------|
| 1 | 9 `_subfiles/` files | Trailing newlines added; all files now end in `\n` |
| 2 | `DESCRIPTION` | `gratia (>= 0.9)` and `mgcv` declared in `Suggests:` |
| 3 | `_sec_gam_example.qmd:1–7` | `.content-hidden` wrapper removed; callout-warning is now visible in HTML, slides, and PDF |
| 4 | 7 subfiles | Standalone display-only `` ```r `` blocks converted to `` ```{r} `` with `#\| eval: false` |

Still open — three low-priority items

1. `_sec_lab.qmd:36` — Orphaned `# noqa: E402` on a commented-out import (carry-over)

# from pygam import LinearGAM, LogisticGAM, s as gam_s, f as gam_f  # noqa: E402

The E402 suppression is an import-order annotation. Since the entire import is commented out, the annotation silences a linter warning that can't fire — it confuses readers who wonder why dead code needs a linter bypass.

Fix: remove the # noqa: E402 suffix (or remove the whole commented line, since the active import from pygam import LinearGAM, s as gam_s appears at line 238 of the smoothing-spline block).


2. `_sec_gams.qmd:95` — One ```r block missed by the cleanup commit (new)

The cleanup commit converted standalone display-only ```r blocks in 7 subfiles, but one indented single-liner inside a list bullet in _sec_gams.qmd was missed:

  the correct idiom is:
  ```r
  gam(y ~ sex + s(age, by = sex), data = d, method = "REML")
  ```

This is a short inline example embedded inside a bullet point. It still renders, but is inconsistent with the rest of the chapter (which now uses `` ```{r} `` throughout).

Fix: convert to `` ```{r} `` with `#| eval: false`.

[Fix →](https://claude.ai/code?q=In%20%60_subfiles%2Fgeneralized-additive-models%2F_sec_gams.qmd%60%20line%2095%2C%20change%20the%20%60%60%60%60r%60%20fence%20to%20%60%60%60%60%7Br%7D%60%20and%20add%20%60%23%7C%20eval%3A%20false%60%20on%20the%20next%20line.%20This%20one-liner%20inside%20the%20predictor-by-factor-interactions%20list%20item%20was%20missed%20by%20the%20recent%20cleanup%20commit%20that%20converted%20standalone%20blocks%20in%207%20other%20subfiles.&repo=d-morrison/rme)

3. `_sec_gam_fitting.qmd:44–50`: `s()` argument table missing `id` (carry-over)

The table lists `bs`, `k`, `by`, `m`, `pc`. The `id` argument (links two smooths to share a single smoothness parameter, e.g. `s(x1, id = 1) + s(x2, id = 1)`) is a commonly-used option in multi-predictor epidemiologic models and worth one row.

[Add `id` row →](https://claude.ai/code?q=In%20%60_subfiles%2Fgeneralized-additive-models%2F_sec_gam_fitting.qmd%60%20in%20the%20%60s()%60%20argument%20table%20(lines%2044-50)%2C%20add%20a%20row%20for%20the%20%60id%60%20argument%3A%20it%20links%20two%20or%20more%20smooths%20to%20share%20a%20single%20smoothness%20parameter%20(e.g.%20%60s(x1%2C%20id%20%3D%201)%20%2B%20s(x2%2C%20id%20%3D%201)%60)%20and%20is%20a%20commonly%20used%20option%20in%20multi-predictor%20models.&repo=d-morrison/rme)

Summary

| # | Severity | File | Line | Issue |
|---|----------|------|------|-------|
| 1 | Low | `_sec_lab.qmd` | 36 | Orphaned `# noqa: E402` on commented-out import — carry-over |
| 2 | Low | `_sec_gams.qmd` | 95 | One `` ```r `` block not converted to `` ```{r} `` — new |
| 3 | Low | `_sec_gam_fitting.qmd` | 44–50 | `s()` table missing `id` argument — carry-over |

All must-fix blockers from previous rounds are confirmed resolved. The three remaining items are all low-severity convention/completeness issues; none block rendering or correctness. The mathematical content and code correctness of the chapter remain sound.

Replace the built-in MASS::Pima.tr placeholder with the course's WCGS coronary-heart-disease cohort, addressing the review concern that the worked example was not an Epi 204 case study (and the student-visible 'illustrative only / replace dataset' callout-warning).

- Outcome: incident CHD (chd69); smoothed continuous predictors: age, bmi, chol, sbp.
- Load WCGS from the package's bundled inst/extdata/wcgs.rds.
- Drop the placeholder callout-warning.
- Adjust the narrative to WCGS reality (most smooths near EDF 1, cholesterol the predictor most likely to show curvature) without asserting specific run-dependent numbers.

Chunks remain eval:false, consistent with the section's existing style and the book's other WCGS code listings.

https://claude.ai/code/session_01NQSLJmjMN22RpwpgJ6Lw6u

claude · 2026-06-09T20:35:46Z

Claude finished @d-morrison's task in 5m 3s

Code Review — GAM Chapter (PR #779)

Gather context from prior review rounds
Read all changed files on current branch
Check which prior-round issues are resolved
Fresh review pass on all changed files
Post findings and summary

Resolved since 2026-06-09 review

The WCGS dataset commit (91b60e9) addressed the last open blocker from the 2026-06-03 review:

| # | File | What was fixed |
|---|------|----------------|
| 1 | `_sec_gam_example.qmd` | `MASS::Pima.tr` replaced with WCGS (`wcgs.rds`); "illustrative only" callout removed; narrative updated for CHD outcome |

Previous blockers confirmed still resolved from 719594b:

Trailing newlines on 9 subfiles ✓
gratia (>= 0.9) and mgcv in DESCRIPTION ✓ (confirmed at DESCRIPTION:49–54)
All standalone ```r display blocks converted to ```{r} #| eval: false ✓

wcgs.rds exists at inst/extdata/wcgs.rds ✓ — the system.file() call will resolve correctly.

Still open — three low-priority items (all carry-overs)

1. `_sec_lab.qmd:36` — Orphaned `# noqa: E402` on commented-out import

# from pygam import LinearGAM, LogisticGAM, s as gam_s, f as gam_f  # noqa: E402

The import is commented out; the linter annotation does nothing and puzzles readers. The active pygam import appears at line 238.


2. `_sec_gams.qmd:95` — One ```r block missed by the cleanup commit

  the correct idiom is:
  ```r
  gam(y ~ sex + s(age, by = sex), data = d, method = "REML")
  ```

This indented inline example inside the `by = sex` bullet was missed by `719594b`. All other display-only R blocks in the chapter now use `` ```{r} #| eval: false ``.

[Fix →](https://claude.ai/code?q=In%20%60_subfiles%2Fgeneralized-additive-models%2F_sec_gams.qmd%60%20line%2095%2C%20change%20the%20%60%60%60%60r%60%20fence%20to%20%60%60%60%60%7Br%7D%60%20and%20add%20%60%23%7C%20eval%3A%20false%60%20as%20the%20first%20line.%20This%20inline%20example%20inside%20the%20%60by%20%3D%20sex%60%20list%20item%20was%20missed%20by%20the%20recent%20cleanup%20commit%20that%20converted%20all%20other%20standalone%20blocks%20in%20this%20chapter.&repo=d-morrison/rme)

3. `_sec_gam_fitting.qmd:44–50`: `s()` argument table missing `id`

The table lists `bs`, `k`, `by`, `m`, `pc`. The `id` argument (links two smooths to share a single smoothness parameter across predictors, e.g. `s(x1, id = 1) + s(x2, id = 1)`) is commonly used in multi-predictor epidemiologic models and worth one row.

[Fix →](https://claude.ai/code?q=In%20%60_subfiles%2Fgeneralized-additive-models%2F_sec_gam_fitting.qmd%60%20in%20the%20%60s()%60%20argument%20table%20(lines%2044-50)%2C%20add%20a%20row%20for%20the%20%60id%60%20argument%3A%20%22links%20two%20or%20more%20smooths%20to%20share%20a%20single%20smoothness%20parameter%20(e.g.%20%60s(x1%2C%20id%20%3D%201)%20%2B%20s(x2%2C%20id%20%3D%201)%60)%3B%20useful%20when%20the%20same%20exposure%20appears%20in%20multiple%20smooths%22&repo=d-morrison/rme)

Summary

| # | Severity | File | Line | Issue |
|---|----------|------|------|-------|
| 1 | Low | `_sec_lab.qmd` | 36 | Orphaned `# noqa: E402` on commented-out import — carry-over |
| 2 | Low | `_sec_gams.qmd` | 95 | `` ```r `` block not converted to `` ```{r} `` — carry-over |
| 3 | Low | `_sec_gam_fitting.qmd` | 44–50 | `s()` table missing `id` argument — carry-over |

All must-fix blockers from previous rounds are confirmed resolved: trailing newlines, `gratia` dependency, `.content-hidden` wrapper, and the WCGS dataset replacement. The mathematical content, code correctness, and pedagogical structure of the chapter are sound. The three remaining items are low-severity convention/completeness issues — none block rendering or correctness.

github-actions · 2026-06-09T20:58:36Z

PR Preview Action v1.8.1
🚀 Preview available at: https://d-morrison.github.io/rme/pr-preview/pr-779/

Built to branch gh-pages at 2026-06-09 20:58 UTC.

…chapter Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>

1. _sec_lab.qmd:36: remove the # noqa: E402 trailing the commented-out pygam import. The import is already commented out, so the linter suppression is meaningless and just noise.
2. _sec_gams.qmd:95: the inline `s(age, by = sex)` example was the one ```r block missed by the prior cleanup commit; convert to ```{r} #| eval: false to match the rest of the chapter.
3. _sec_gam_fitting.qmd: add the `id` argument to the s() argument table, used to share a single smoothness parameter across two or more smooths (e.g. s(x1, id = 1) + s(x2, id = 1)). Commonly needed in multi-predictor epi models.

Also merged origin/main (the branch was 125 commits behind).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>

claude · 2026-06-18T17:34:34Z

Claude finished @d-morrison's task in 4m 50s

Code Review — GAM Chapter (PR #779)

Gather context from prior review rounds
Read all changed files on current branch
Check which prior-round issues are resolved
Fresh review pass on all changed files
Post findings

Resolved since 2026-06-09 review (3/3)

All three low-priority items from the previous round are confirmed fixed:

| # | File | What was fixed |
|---|------|----------------|
| 1 | `_sec_lab.qmd:36` | Orphaned `# noqa: E402` removed from commented-out import |
| 2 | `_sec_gams.qmd:95` | Inline `` ```r `` block inside list bullet converted to `` ```{r} `` with `#\| eval: false` |
| 3 | `_sec_gam_fitting.qmd:49` | `id` row added to `s()` argument table |

All must-fix blockers from prior rounds remain resolved:

Trailing newlines on 9 subfiles ✓ (0 "No newline at end of file" in current diff)
gratia (>= 0.9) and mgcv in DESCRIPTION:49–53 ✓
SplineTransformer unused import removed ✓
XX.ravel() used consistently in smoothing-spline Python plot ✓
Bare --- separators removed from both exercise subfiles ✓

New findings

1. (Very low) Commented-out import in setup block is now just dead code

_sec_lab.qmd:36

# from pygam import LinearGAM, LogisticGAM, s as gam_s, f as gam_f

The # noqa: E402 annotation was correctly removed, but the commented-out line itself remains. It imports LogisticGAM and f as gam_f, both of which appear in their own import statements within the chunks that use them (lines 293 and 350). The setup-block comment has no purpose and may puzzle students. Remove the whole line.

2. (Very low) Two remaining references to Pima data in illustration examples

_sec_gams.qmd:41 — #exm-gam still says "For the Pima diabetes data…" in a purely conceptual formula example (no code).
_sec_local_regression.qmd:40 — #exm-local-regression has loess(glu ~ age, data = MASS::Pima.tr, span = 0.5) as a quick LOESS syntax illustration.

These are not the main worked example (which was correctly updated to WCGS), and both use MASS::Pima.tr, which is always available from a CRAN package and never broken by a missing local file. Per the PR description, only the main worked example needed updating in this PR. Noting for follow-up completeness.

Summary

| # | Severity | File | Line | Issue |
|---|----------|------|------|-------|
| 1 | Very low | `_sec_lab.qmd` | 36 | Commented-out import is now just dead code; remove it |
| 2 | Very low | `_sec_gams.qmd`, `_sec_local_regression.qmd` | 41, 40 | Remaining Pima references in conceptual illustration examples (low-risk, follow-up PR) |

The chapter is otherwise clean. All prior must-fix and should-fix items are confirmed resolved. The mathematical content, code correctness, and pedagogical structure remain sound. The two remaining notes are housekeeping items that do not block merge.

1. _sec_lab.qmd:36: drop the commented-out pygam import entirely. The active imports of LogisticGAM/gam_f live in the chunks that use them (lines 293, 350), so the setup-block comment was just dead code that could puzzle students.
2. _sec_gams.qmd:38: switch the #exm-gam conceptual example from "Pima diabetes data with bmi/glu/age → diabetes" to "WCGS data with age/sbp/chol → chd69" so it matches the worked example's WCGS framing.
3. _sec_local_regression.qmd:34: switch #exm-local-regression from loess(glu ~ age, MASS::Pima.tr) to loess(chol ~ age, wcgs) for the same reason; pivot the surrounding prose from age 40/glucose to age 50/cholesterol to match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>

claude · 2026-06-18T17:43:13Z

Claude finished @d-morrison's task in 4m 17s

Code Review — GAM Chapter (PR #779)

Gather context from prior review rounds
Read current state of changed files
Check which prior-round issues are resolved
Fresh review pass on all changed files
Post findings

Resolved since last review (2/2)

Both "very low" items from the 2026-06-18 review are confirmed fixed:

| # | File | What was fixed |
|---|------|----------------|
| 1 | `_sec_lab.qmd:36` | Commented-out dead import removed entirely (no trace of the line or its `# noqa: E402` annotation) |
| 2 | `_sec_gams.qmd:38–51`, `_sec_local_regression.qmd:40–43` | Both remaining Pima references updated to WCGS: `exm-gam` now uses WCGS CHD example; LOESS example uses `loess(chol ~ age, data = wcgs, ...)` |

All must-fix blockers from prior rounds remain resolved:

Trailing newlines: confirmed 0 "No newline at end of file" entries in the full diff ✓
gratia (>= 0.9) in DESCRIPTION:49 ✓
All display-only ```r blocks converted to ```{r} #| eval: false ✓
XX.ravel() used consistently in Python smoothing-spline plot ✓
Bare --- separators removed from exercise subfiles ✓
LOESS correctly described as lacking a parametric formula (not limited to training-point predictions) ✓
AIC comment gives the right reason (REML/ML scale mismatch, not fixed-effects structure) ✓

New finding (low priority)

`gratia` declared in `DESCRIPTION` but absent from `renv.lock`

DESCRIPTION:49 declares gratia (>= 0.9) in Suggests:, and mgcv is already in renv.lock (which lists 296 packages), but gratia is not. This is expected behaviour for a Suggests-only dependency whose code all carries #| eval: false: renv::snapshot() does not pick up packages that appear only in unevaluated chunks, so it will never add gratia automatically.

The practical consequence: renv::restore() on a fresh clone will not install gratia, and students who want to run the gratia::draw() / gratia::smooth_samples() code in _sec_gam_example.qmd and _sec_gam_inference.qmd will need to install it separately. Since the code is already annotated library(gratia) # tidy plots for gam fits (optional) and the DESCRIPTION declares the version floor, this is self-documenting.

If the project ever adds a rendered vignette or example that actually runs gratia code, a manual renv::install("gratia"); renv::snapshot() would be needed at that point. For now this is tracked-but-deferred.
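A quick way to catch this kind of drift is to compare the Suggests field against the lockfile directly. The following is a minimal sketch, not part of the PR: the helper names `suggested_packages` and `missing_from_lockfile` and the toy file contents are hypothetical, and the only assumption about the real files is that renv.lock is JSON with a top-level "Packages" object keyed by package name.

```python
import json
import re


def suggested_packages(description_text):
    """Return the package names listed in the Suggests field of a DESCRIPTION file."""
    # DCF fields start at column 0; continuation lines start with whitespace.
    match = re.search(
        r"^Suggests:(.*(?:\n[ \t]+.*)*)", description_text, flags=re.MULTILINE
    )
    if match is None:
        return []
    names = []
    for entry in match.group(1).split(","):
        # Drop version constraints such as "(>= 0.9)".
        name = re.sub(r"\(.*?\)", "", entry).strip()
        if name:
            names.append(name)
    return names


def missing_from_lockfile(description_text, lockfile_text):
    """Return Suggests packages that have no entry under "Packages" in renv.lock."""
    locked = json.loads(lockfile_text).get("Packages", {})
    return [pkg for pkg in suggested_packages(description_text) if pkg not in locked]


# Toy inputs for illustration only; real use would read DESCRIPTION and renv.lock.
description = """Package: example
Suggests:
    gratia (>= 0.9),
    mgcv
Imports:
    ggplot2
"""
lockfile = json.dumps(
    {"R": {"Version": "4.4.0"}, "Packages": {"mgcv": {"Package": "mgcv"}}}
)
print(missing_from_lockfile(description, lockfile))  # ['gratia']
```

Run against the real files, a non-empty result would flag exactly the situation described here: a package declared in Suggests that renv::restore() will not install.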

Overall verdict

The chapter is clean. All must-fix and should-fix items from previous rounds are confirmed resolved; the single remaining note is a low-priority renv tracking issue that does not affect rendering, correctness, or student experience (given the (optional) annotation). The mathematical content, code correctness, and pedagogical structure of the 11-section chapter remain sound.

DESCRIPTION:49 declares gratia (>= 0.9) in Suggests, but renv::snapshot() doesn't auto-detect it because all gratia code lives in eval: false chunks. Force-record via renv::record('gratia') so renv::restore() on a fresh clone installs it. Other Suggests dependencies are already present in renv.lock (mgcv, ISLR2, boot, etc.), so this brings gratia into line with that pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>

claude · 2026-06-18T17:52:02Z

Claude finished @d-morrison's task in 4m 31s

Code Review — GAM Chapter (PR #779)

Gather context from prior review rounds
Read all changed files on current branch
Confirm which prior-round issues are resolved
Fresh review pass on all changed files
Post findings

Resolved since last review (all 3 carry-overs)

Commits 225baa7, 7ad9aa8, and 46f37d8 addressed every remaining item:

| # | File | What was fixed |
|---|------|----------------|
| 1 | `_sec_lab.qmd:36` | Commented-out `pygam` import removed entirely (commit `7ad9aa8`) |
| 2 | `_sec_gams.qmd:38–51`, `_sec_local_regression.qmd:34–51` | Both remaining Pima references updated to WCGS (commit `7ad9aa8`) |
| 3 | `renv.lock` | `gratia 0.11.2` recorded via `renv::record()` so `renv::restore()` installs it on a fresh clone (commit `46f37d8`) |

All must-fix and should-fix items from prior rounds confirmed still resolved:

Trailing newlines: no \ No newline at end of file in any file's diff ✓
gratia (>= 0.9) and mgcv in DESCRIPTION:49,51 ✓
All display-only R blocks use ```{r} #| eval: false (including the one-liner in _sec_gams.qmd:95-98) ✓
XX.ravel() used consistently in Python smoothing-spline plot ✓
Bare --- separators removed from both exercise subfiles ✓
LOESS correctly described as lacking a parametric formula ✓
AIC comment gives the right reason (REML/ML scale mismatch) ✓
id argument row added to s() argument table ✓
# noqa: E402 orphaned annotation removed ✓
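
The trailing-newline item above was verified against the diff. As a complementary check on the working tree, a minimal sketch (the function name `files_missing_final_newline` is hypothetical, not something in the repository) could scan the files themselves:

```python
from pathlib import Path


def files_missing_final_newline(paths):
    """Return the paths whose contents are non-empty and do not end in a newline."""
    missing = []
    for path in paths:
        data = Path(path).read_bytes()
        # Empty files are skipped: they have no final line to terminate.
        if data and not data.endswith(b"\n"):
            missing.append(str(path))
    return missing
```

For example, `files_missing_final_newline(Path(".").rglob("*.qmd"))` would list any Quarto subfile that still lacks a final newline; an empty list corresponds to the ✓ recorded above.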

Citation audit

All six BibTeX keys used by this PR resolve in references.bib:
@aune2016bmi (line 1706, with pages = {i2156}), @gasparrini2015mortality (line 1716), @wood2013pvalues (line 1727), @james2023islp (line 1738), @wood2017generalized (line 797), @me4 (line 1245). ✓
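
An audit like this can be reproduced mechanically. The sketch below is illustrative only: `unresolved_keys` is a hypothetical helper, and the regular expression assumes conventional BibTeX entries of the form `@type{key,`.

```python
import re


def unresolved_keys(cited_keys, bib_text):
    """Return the cited keys that have no matching entry in the BibTeX text."""
    # Capture the key from entries such as "@article{aune2016bmi,".
    defined = set(re.findall(r"@\w+\s*\{\s*([^,\s]+)\s*,", bib_text))
    return [key for key in cited_keys if key.lstrip("@") not in defined]
```

Passing the six keys listed above together with the contents of references.bib should return an empty list if the audit result holds.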

Fresh review: no new issues found

The chapter is clean. All must-fix blockers and should-fix items from all prior rounds are confirmed resolved. The mathematical content, code correctness, and pedagogical structure of the 11-section chapter are sound.

github-actions Bot deleted a comment from claude Bot May 20, 2026

d-morrison added the clear freezer label May 20, 2026

github-actions Bot deleted a comment from claude Bot May 20, 2026

github-actions Bot deleted a comment from claude Bot May 21, 2026

d-morrison force-pushed the claude/issue-775-gam-chapter branch from 79a7d81 to ef26f15 on June 9, 2026 06:28

claude added 2 commits June 9, 2026 08:00

Merge remote-tracking branch 'origin/main' into claude/issue-775-gam-chapter (72c373e)

d-morrison and others added 2 commits June 18, 2026 10:32

Merge remote-tracking branch 'origin/main' into claude/issue-775-gam-chapter (2d9046d)

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
