GAM chapter (closes #775) #779

Open: d-morrison wants to merge 8 commits into main from claude/issue-775-gam-chapter

@d-morrison

Owner

Summary

Initial draft of a new chapter on generalized additive models, addressing d-morrison/rme#775. Placed at the end of the GLMs part of the book; structure follows ISL Ch 7 progression with extensions from Wood 2017 for the penalized-likelihood / REML / inference content.

Chapter outline (11 sections)

  1. Motivation — why nonlinear: age, BMI, temperature exposure-mortality examples.
  2. Polynomial regression — definition, quadratic example, limitations (global behavior, no local control, unstable degree selection, extrapolation explodes).
  3. Step functions — categorized predictors, when it's appropriate, when it isn't.
  4. Basis functions — the unifying basis-expansion framework that polys and step functions both fit into.
  5. Regression splines — piecewise polys, knots, smoothness constraints, truncated power basis, natural cubic splines, knot placement, splines::ns() / splines::bs().
  6. Smoothing splines — penalized RSS, effective degrees of freedom, REML/GCV smoothness selection, regression-vs-smoothing comparison.
  7. Local regression — LOESS, strengths, limitations, role as a diagnostic smoother.
  8. GAMs — the additive model definition, interpretability, escape hatches (by =, te()).
  9. Fitting in R: mgcv::gam(), s() arguments, REML vs GCV, gam.check(), plot.gam().
  10. Inference — reading summary(), EDFs, smooth-term p-value caveats per @wood2013pvalues, AIC, when to prefer a GLM.
  11. Worked example — logistic GAM on MASS::Pima.tr, from glm baseline → gam → partial-effect plots → response-scale plot.

New bib entries

  • @aune2016bmi — Aune et al. 2016 BMJ BMI/mortality meta-analysis
  • @gasparrini2015mortality — Gasparrini et al. 2015 Lancet temperature/mortality multi-country study
  • @wood2013pvalues — Wood 2013 Biometrika on smooth-term p-values

Caveats / known follow-ups

  • Worked example uses MASS::Pima.tr (built-in, no extra data dependency). The intro note flags that this should be swapped for a course-appropriate Epi 204 case study before the chapter is finalized.
  • No figures yet. The polynomial / step / spline / LOESS / GAM sections describe shapes but don't include code chunks that render comparison plots. A follow-up should add per-method illustration figures using a single shared synthetic dataset.
  • No exercises section. Most other rme chapters have an _exr-*.qmd exercises subfile; one should be added once the chapter content stabilizes.

Pre-commit checks

  • quarto render chapters/generalized-additive-models.qmd --to html
  • lintr::lint() on all 11 new subfiles (no lints)
  • spelling::spell_check_package() (no errors)

Test plan

  • Render the chapter and verify all 11 sections appear under sensible headings.
  • Cross-reference checks: every @def-*, @exm-*, @thm-* label resolves.
  • Substantive content review (math correctness, citation accuracy, pedagogical flow).
  • Decide whether to swap MASS::Pima.tr for a course dataset in this PR or a follow-up.

🤖 Generated with Claude Code

@d-morrison

Owner Author

@claude review

@github-actions github-actions Bot deleted a comment from claude Bot May 20, 2026
@github-actions

Contributor

👀 Picked up by workflow run #26159255775. R/Quarto/renv setup runs first (~3-5 min); Claude itself responds after that.

@github-actions github-actions Bot deleted a comment from claude Bot May 20, 2026
@d-morrison

Owner Author

@claude review

@github-actions

Contributor

👀 Picked up by workflow run #26177399193. R/Quarto/renv setup runs first (~3-5 min); Claude itself responds after that.

@github-actions github-actions Bot deleted a comment from claude Bot May 20, 2026
@github-actions github-actions Bot deleted a comment from claude Bot May 20, 2026
@github-actions github-actions Bot deleted a comment from claude Bot May 20, 2026
@github-actions github-actions Bot deleted a comment from claude Bot May 20, 2026
@d-morrison

Owner Author

Round-3 review came back with 12 items across 3 priority tiers, but on verification most don't match the current source. Per-item audit:

Hallucinated (not present in 7b2f8916):

  • (1) "Duplicate x = in labs()" — file has labs(x = "BMI (kg/m²)", y = "...", caption = "...") — one x = and one y =. No duplicate.
  • (2) "smoaths typo in _sec_gams.qmd" — heading actually reads "Adding interactions and tensor smooths" (correctly spelled).
  • (3) "Raw \beta / \varepsilon / \mathbb{E}" — none of these appear in any of the three named files. They were all fixed in the previous round (7b2f8916 macro-bug commit).
  • (4) "Missing #| code-fold: true on visible R chunks in worked example" — every chunk in _sec_gam_example.qmd is a prose code block (```r literal lowercase, not ```{r} executed). code-fold doesn't apply to non-executed display chunks.
  • (5) "Figures not wrapped in figure divs" — same reason: no figure-producing chunks in the worked example, just code listings.
  • (7) "##### headings in _sec_smoothing_splines.qmd" — no 5-hash headings anywhere in that file.
  • (12) "4-colon fences in _sec_basis_functions.qmd" — verified with re.match(r'^(:+)', ...): all fences are 3 colons.
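The fence check in item (12) can be sketched roughly as follows. Only the `re.match(r'^(:+)', ...)` pattern comes from the comment above; the helper name `colon_fence_depths` and the sample text are hypothetical, and the sample deliberately includes a 4-colon fence so the check has something to report.

```python
import re


def colon_fence_depths(text):
    """Return (line_number, colon_count) for every line that starts with colons."""
    depths = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = re.match(r'^(:+)', line)
        if match:
            depths.append((number, len(match.group(1))))
    return depths


# Hypothetical sample, not taken from _sec_basis_functions.qmd.
sample = "::: notes\nSome prose.\n:::\n\n:::: {.callout-note}\nNested.\n::::\n"
depths = colon_fence_depths(sample)

# Any fence deeper than 3 colons would be the kind of finding item (12) described.
deep = [(number, count) for number, count in depths if count > 3]
print(deep)
```

Run against the real subfile, an empty `deep` list would match the "all fences are 3 colons" result reported above.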

Actionable (fixed in c48423183):

  • (6) The fig-method-comparison-data-fits chunk label was misleading because the chunk produces no figure (include: false — it fits models and stores predictions for the four panel chunks to plot). Renamed to method-comparison-data-fits (dropped fig- prefix).
  • (11) Added [@wood2017generalized, §5.4] locator to the select = TRUE description in _sec_gam_fitting.qmd.

Deferred (acknowledged in PR description):

  • (8) No exercises section.
  • (9) MASS::Pima.tr placeholder dataset.

Skipped per CLAUDE.md:

  • (10) "Page/figure locators for @aune2016bmi and @gasparrini2015mortality" — CLAUDE.md citation policy is "only cite a page number after verifying it directly from the source PDF." No PDFs accessible in this session; can be added in a follow-up once verified.

Triggering another review for confirmation.

@d-morrison

Owner Author

@claude review

@github-actions

Contributor

👀 Picked up by workflow run #26186262395. R/Quarto/renv setup runs first (~3-5 min); Claude itself responds after that.

@github-actions github-actions Bot deleted a comment from claude Bot May 20, 2026
@github-actions github-actions Bot deleted a comment from claude Bot May 20, 2026
@d-morrison

Owner Author

@claude review

@github-actions

Contributor

👀 Picked up by workflow run #26196646462. R/Quarto/renv setup runs first (~3-5 min); Claude itself responds after that.

@d-morrison

Owner Author

@claude review

@github-actions

Contributor

👀 Picked up by workflow run #26200740705. R/Quarto/renv setup runs first (~3-5 min); Claude itself responds after that.

@d-morrison

Owner Author

@claude review

@github-actions github-actions Bot deleted a comment from claude Bot May 21, 2026
@github-actions github-actions Bot deleted a comment from claude Bot May 21, 2026
@github-actions github-actions Bot deleted a comment from claude Bot May 21, 2026
@github-actions github-actions Bot deleted a comment from claude Bot May 21, 2026
@d-morrison

Owner Author

@claude review

@github-actions github-actions Bot deleted a comment from claude Bot May 21, 2026
@github-actions

Contributor

👀 Picked up by workflow run #26203430876. R/Quarto/renv setup runs first (~3-5 min); Claude itself responds after that.

@github-actions github-actions Bot deleted a comment from claude Bot May 21, 2026
@github-actions github-actions Bot deleted a comment from claude Bot May 21, 2026
@claude

claude Bot commented May 27, 2026

Contributor

Claude Code is working…

I'll analyze this and get back to you.

View job run

@claude

claude Bot commented May 28, 2026

Contributor

Claude Code is working…

I'll analyze this and get back to you.

View job run

@d-morrison

Owner Author

todo: change "Show R code" to "Show code"

@claude

claude Bot commented Jun 2, 2026

Contributor

Claude Code is working…

I'll analyze this and get back to you.

View job run

@claude

claude Bot commented Jun 3, 2026

Contributor

Claude Code is working…

I'll analyze this and get back to you.

View job run

@d-morrison

Owner Author

@claude review

@github-actions

github-actions Bot commented Jun 3, 2026

Contributor

👀 Picked up by workflow run #26915839558. R/Quarto/renv setup runs first (~3-5 min); Claude itself responds after that.

@d-morrison

Owner Author

Claude Code CLI (local session) is working on this — paws off until I'm done.

@d-morrison

Owner Author

The latest @claude comments on this PR are in-progress stubs ("Claude Code is working…") that never posted a final verdict — the last completed review was 2026-05-26. I merged current main (8ac3b34) to re-trigger the pull_request review workflow against an up-to-date branch.

I re-audited every finding from the last completed review and the inline comments; all are already resolved in the current branch:

  • "two more constraints" → "four more constraints" — fixed (_sec_regression_splines.qmd:91).
  • ISLRv2 "Applied 6" → "Applied 7" — fixed (_exr-gams-applied.qmd:6).
  • Backfitting no-intercept bias — fixed: R uses coef(lm(r1 ~ x1))[2] + f1 - mean(f1); Python uses np.cov/np.var (_exr-gams-applied.qmd:118,144).
  • PolynomialFeatures bias column — fixed: include_bias=False (_exr-gams-applied.qmd:61).
  • Cross-refs inside code block — the @sec-smoothing-splines/@sec-gam-fitting note is now in prose, not a code chunk (_sec_lab.qmd:268).
  • plot(XX, …) 2D inconsistency and unused SplineTransformer import — fixed/removed (_sec_lab.qmd:252).
  • aune2016bmi missing pages — added pages = {i2156} (references.bib:1700).
  • Placeholder callout-warning visible to students — section is now wrapped in ::::: {.content-hidden} (_sec_gam_example.qmd:1).
  • Subfiles beginning with a heading — none do; all GAM subfiles open with ::: notes.
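As a rough illustration of the backfitting item above: when each smoother is a straight-line fit, the slope of the partial residuals has to be estimated with an intercept, i.e. as cov(x, r) / var(x), rather than through the origin. The sketch below is a pure-Python stand-in for the `np.cov`/`np.var` version mentioned in the fix, not the exercise's actual code; the function names and the toy data are hypothetical.

```python
from statistics import mean


def slope_with_intercept(x, r):
    """OLS slope of r on x when the fit includes an intercept: cov(x, r) / var(x)."""
    x_bar, r_bar = mean(x), mean(r)
    sxy = sum((xi - x_bar) * (ri - r_bar) for xi, ri in zip(x, r))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def backfit_linear(x1, x2, y, n_iter=200):
    """Backfit y = b0 + f1(x1) + f2(x2) with centered straight-line smoothers."""
    b0 = mean(y)
    m1, m2 = mean(x1), mean(x2)
    b1 = b2 = 0.0
    for _ in range(n_iter):
        # Partial residuals for x1: remove the intercept and the centered x2 term.
        r1 = [yi - b0 - b2 * (v - m2) for yi, v in zip(y, x2)]
        b1 = slope_with_intercept(x1, r1)
        # Partial residuals for x2: remove the intercept and the centered x1 term.
        r2 = [yi - b0 - b1 * (u - m1) for yi, u in zip(y, x1)]
        b2 = slope_with_intercept(x2, r2)
    return b1, b2


# Hypothetical noiseless data generated with slopes 2 and -3.
x1 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.0, 0.0, 2.0, 1.0, 3.0, 2.0]
y = [1.0 + 2.0 * u - 3.0 * v for u, v in zip(x1, x2)]

b1, b2 = backfit_linear(x1, x2, y)
print(b1, b2)
```

Because the data are exactly linear and the two predictors are not collinear, the backfitted slopes should agree with the generating slopes up to floating-point error, which is the "backfitting converges to OLS for linear smoothers" behavior the exercise relies on.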

Fresh review now running on the merge commit.

@github-actions

github-actions Bot commented Jun 3, 2026

Contributor

👀 Picked up by workflow run #26917221781. R/Quarto/renv setup runs first (~3-5 min); Claude itself responds after that.

@claude

claude Bot commented Jun 3, 2026

Contributor

Claude Code is working…

I'll analyze this and get back to you.

View job run

@github-actions

github-actions Bot commented Jun 3, 2026

Contributor

Now I have enough to write a thorough review.


Code Review — PR #779: GAM chapter (closes #775)

Overview

This PR adds a substantial new chapter on Generalized Additive Models (11 sections, 2,356 lines, 16 new files). The structure follows ISL Ch. 7, the content is mathematically sound, and the pedagogical flow is well-designed. The PR description is transparent about the known gaps (no course-specific dataset, incomplete exercises). Below are the findings.


Must-fix before merge

1. Nine subfiles are missing trailing newlines

Every new _subfiles/ file except _exr-gams-conceptual.qmd, _sec_gam_example.qmd, _sec_gam_fitting.qmd, _sec_gam_inference.qmd, and _sec_lab.qmd ends without a newline character (the diff shows \ No newline at end of file). The affected files:

  • _exr-gams-applied.qmd
  • _sec_basis_functions.qmd
  • _sec_gams.qmd
  • _sec_local_regression.qmd
  • _sec_motivation.qmd
  • _sec_polynomial.qmd
  • _sec_regression_splines.qmd
  • _sec_smoothing_splines.qmd
  • _sec_step_functions.qmd

POSIX text files must end in a newline. Add one to each.
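One way to check and repair this is sketched below, assuming nothing beyond the Python standard library. The helper names are hypothetical, and the demo runs on a temporary file rather than on the real subfiles.

```python
import os
import tempfile


def missing_trailing_newline(path):
    """Return True when a non-empty file does not end with a newline byte."""
    with open(path, "rb") as handle:
        data = handle.read()
    return len(data) > 0 and not data.endswith(b"\n")


def add_trailing_newline(path):
    """Append one newline byte if the file lacks it; return True if changed."""
    if not missing_trailing_newline(path):
        return False
    with open(path, "ab") as handle:
        handle.write(b"\n")
    return True


# Demo on a throwaway file (hypothetical name) that ends mid-content.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "_sec_demo.qmd")
    with open(demo, "wb") as handle:
        handle.write(b"Last sentence.")
    before = missing_trailing_newline(demo)
    changed = add_trailing_newline(demo)
    after = missing_trailing_newline(demo)

print(before, changed, after)
```

Calling `add_trailing_newline` a second time is a no-op, so the repair can be looped over all nine files without risk of adding blank lines.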

2. gratia is used but not declared as a dependency

gratia::draw(), gratia::smooth_samples(), and gratia::compare_smooths() are referenced in _sec_gam_fitting.qmd, _sec_gam_inference.qmd, and _sec_gam_example.qmd. The package is not in DESCRIPTION (Suggests) and not in renv.lock. Either add it as a suggested dependency (renv::install("gratia"); renv::snapshot()) or qualify the prose with "if you install the optional gratia package…" The version note in _sec_gam_inference.qmd (gratia >= 0.9) also floats without being backed by a minimum version in DESCRIPTION.


Should-fix (convention violations)

3. Inconsistent code block syntax in _sec_gam_example.qmd

The worked example section uses plain backtick-r fences throughout (markdown syntax highlighting only), while every other code-containing subfile in this chapter uses {r} Quarto chunk syntax with #| eval: false. These are functionally equivalent when the chunk won't execute, but the inconsistency will confuse readers and break if someone adds #| eval: true later. Recommend converting all blocks to {r} + #| eval: false for consistency, or at minimum add a comment explaining the intentional choice.

4. .content-hidden hides the dataset warning from everyone

In _sec_gam_example.qmd:

:::: {.content-hidden}
::: callout-warning
This section is illustrative only …
:::
::::

{.content-hidden} removes the callout from the rendered HTML output, so neither students nor instructors see the warning. The intent (per PR description) is to flag that MASS::Pima.tr should be replaced with a course dataset. If the goal is an instructor-only reminder, this should be a :::notes block (which renders in presenter view) or a plain Quarto comment. If it's meant as a visible student notice, drop .content-hidden entirely.


Nice-to-have / minor

5. ## Setup {.unnumbered} placement inconsistency

_sec_gam_example.qmd places ## Setup {.unnumbered} inside the subfile. In contrast, for the lab section the parent chapter places ## Setup {.unnumbered} before the {{< include >}}. Neither pattern violates the convention (the rule only prohibits top-level # headings in subfiles), but for consistency across the chapter the ## Setup in the worked example could be moved to the parent.

6. s() argument table missing id row

_sec_gam_fitting.qmd has a useful s() arguments table, but id (linking smooths to share a smoothness parameter across predictors) is a commonly-used argument in multi-predictor epidemiologic models. Worth a one-line addition.


Content quality (no action needed, FYI)

The mathematics is correct throughout:

  • Cubic spline parameter count ($K+4$, reduced to $K$ by natural spline boundary constraints) ✓
  • Truncated power basis continuity proof in the conceptual exercise ✓
  • Smoothing-spline penalty limits ($\lambda \to 0$ interpolates, $\lambda \to \infty$ → OLS line) ✓
  • Backfitting convergence to OLS for linear smoothers ✓
  • EDF interpretation (per-smooth column excludes intercept, so linear smooth shows EDF ≈ 1, not 2) ✓
  • Confidence intervals computed on the link scale then back-transformed — correct and the reasoning is clearly explained ✓
  • method = "ML" required for AIC comparison with glm() — correct and the explanation is the right level of detail ✓
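To make the link-scale interval item concrete, here is a minimal sketch assuming a logit link and a 95% normal-approximation interval. The linear predictor and standard error are made-up values, not output from the chapter's model, and the function names are hypothetical.

```python
import math


def expit(eta):
    """Inverse logit: map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def response_scale_interval(eta_hat, se_eta, z=1.96):
    """Build the interval on the link scale, then back-transform both ends."""
    lower = expit(eta_hat - z * se_eta)
    upper = expit(eta_hat + z * se_eta)
    return expit(eta_hat), lower, upper


# Hypothetical fitted value and standard error on the logit scale.
eta_hat, se_eta = 3.0, 1.0
p_hat, lower, upper = response_scale_interval(eta_hat, se_eta)

# For contrast: a symmetric interval built directly on the response scale,
# using the delta-method standard error p * (1 - p) * se_eta.
se_p = p_hat * (1.0 - p_hat) * se_eta
naive_upper = p_hat + 1.96 * se_p

print(p_hat, lower, upper, naive_upper)
```

With these inputs the back-transformed interval stays inside (0, 1), while the response-scale upper limit exceeds 1, which is the usual argument for computing the interval on the link scale first.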

The section on tensor-product smooths (te()) correctly identifies the key point about non-comparable scales and contrasts with isotropic smooths.

Citation check: Both @me4 and @hastie2009esl2e resolve in references.bib — no phantom keys.
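For reference, the cubic-spline parameter count in the checklist above can be written out as a short derivation. It assumes the convention in which a cubic spline with $K$ knots has $K+1$ cubic pieces and the natural-spline constraints force linearity beyond each of the two boundary knots:

```latex
\begin{aligned}
\underbrace{4(K+1)}_{\text{coefficients in } K+1 \text{ cubic pieces}}
\;-\; \underbrace{3K}_{\text{continuity of } f,\ f',\ f'' \text{ at each knot}}
&= K + 4, \\
(K + 4) \;-\; \underbrace{2 \times 2}_{f'' = f''' = 0 \text{ beyond each boundary knot}}
&= K .
\end{aligned}
```

The $2 \times 2 = 4$ boundary constraints are the "four more constraints" wording adopted in `_sec_regression_splines.qmd`.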

— posted by @claude post-step from workflow run

@claude

claude Bot commented Jun 5, 2026

Contributor

Claude Code is working…

I'll analyze this and get back to you.

View job run

@claude

claude Bot commented Jun 6, 2026

Contributor

Claude Code is working…

I'll analyze this and get back to you.

View job run

Initial draft of a chapter on generalized additive models. Squashed
from the original 69-commit history because the action's recursive
submodule fetch was failing on commits that referenced a now-deleted
latex-macros SHA. See the PR description for the chapter outline,
new bib entries, and caveats.
@d-morrison d-morrison force-pushed the claude/issue-775-gam-chapter branch from 79a7d81 to ef26f15 Compare June 9, 2026 06:28
@claude

claude Bot commented Jun 9, 2026

Contributor

Claude finished @d-morrison's task in 5m 56s.


Code Review — GAM Chapter (PR #779)

  • Gather context from prior review rounds
  • Read all changed files on current branch
  • Check which prior-round issues are resolved
  • Fresh review pass on all changed files
  • Post inline findings and summary

Resolved since 2026-06-03 review

The two code-correctness bugs flagged in prior rounds are confirmed fixed:

  • _sec_lab.qmd: plot(XX.ravel(), ...) — 2D array inconsistency fixed ✓
  • _sec_lab.qmd: SplineTransformer unused import removed ✓

Still open: must-fix

1. Nine subfiles missing trailing newlines

POSIX text files must end with \n. Every file below currently ends mid-content with no trailing newline — confirmed by tail -c 1 <file> | od -An -tx1:

| File | Last byte |
|------|-----------|
| `_exr-gams-applied.qmd` | `3a` (`:`) |
| `_sec_basis_functions.qmd` | `2e` (`.`) |
| `_sec_gams.qmd` | `2e` (`.`) |
| `_sec_local_regression.qmd` | `2e` (`.`) |
| `_sec_motivation.qmd` | `2e` (`.`) |
| `_sec_polynomial.qmd` | `2e` (`.`) |
| `_sec_regression_splines.qmd` | `60` (`` ` ``) |
| `_sec_smoothing_splines.qmd` | `2e` (`.`) |
| `_sec_step_functions.qmd` | `2e` (`.`) |

Fix: add a single \n to the end of each file.


2. gratia used in 2 files but not in DESCRIPTION

gratia is called in two subfiles but is absent from both Suggests: in DESCRIPTION and renv.lock:

  • _sec_gam_example.qmd:20: library(gratia) in setup block; gratia::draw(m_gam) at line 116
  • _sec_gam_inference.qmd:77: gratia::smooth_samples() (with version note >= 0.9); gratia::compare_smooths() at line 97

Without the declaration, R CMD check will warn, the chapter won't render on a fresh environment, and students following the code will hit a missing-package error.

Fix: add gratia (>= 0.9) to Suggests: in DESCRIPTION and run renv::snapshot() to lock the version.
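A quick way to confirm the declaration landed is to parse the Suggests field and look the package up. The sketch below is only that, a sketch: the helper name is hypothetical, the sample DESCRIPTION text is invented for the demo (including the package name), and the parser handles just the simple `name (constraint)` form.

```python
import re


def suggested_packages(description_text):
    """Map package name -> version constraint (or None) from the Suggests field."""
    # DCF continuation lines start with whitespace; capture them with the field.
    match = re.search(
        r"^Suggests:(.*(?:\n[ \t]+.*)*)", description_text, flags=re.MULTILINE
    )
    if not match:
        return {}
    packages = {}
    for entry in match.group(1).split(","):
        entry = entry.strip()
        if not entry:
            continue
        parsed = re.match(r"^([A-Za-z0-9.]+)\s*(?:\((.*)\))?$", entry)
        if parsed:
            packages[parsed.group(1)] = parsed.group(2)
    return packages


# Invented DESCRIPTION fragment for illustration only.
sample = """Package: examplepkg
Suggests:
    gratia (>= 0.9),
    mgcv,
    knitr
Imports:
    ggplot2
"""

pkgs = suggested_packages(sample)
print(pkgs.get("gratia"), "mgcv" in pkgs, "ggplot2" in pkgs)
```

This only checks DESCRIPTION; whether renv.lock also pins the package still has to be verified after `renv::snapshot()`.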



Still open: should-fix

3. .content-hidden makes the dataset-placeholder warning invisible to everyone

_sec_gam_example.qmd:1–7

:::: {.content-hidden}
::: callout-warning
This section is illustrative only — the dataset is a built-in R
example, not an Epi 204 case study. …
:::
::::

{.content-hidden} removes the callout from all rendered output — HTML, slides, PDF. Neither students nor instructors see it. The intent (per PR description) is to flag the placeholder dataset; if it should only be visible during review, use an HTML comment (`<!-- … -->`), which Quarto leaves out of rendered output. If it's a student-visible reminder, drop .content-hidden. Either way, the current form defeats its own purpose.


4. Display-only R code blocks use markdown syntax instead of Quarto syntax

_sec_gam_example.qmd, _sec_gam_fitting.qmd, _sec_gam_inference.qmd, _sec_local_regression.qmd, _sec_polynomial.qmd, _sec_regression_splines.qmd, and _sec_smoothing_splines.qmd all show R code as plain ```r markdown blocks rather than ```{r} #| eval: false Quarto chunks. The rest of the chapter uses Quarto syntax.

The practical consequence: plain markdown blocks cannot carry #| code-fold: true, #| label:, or any Quarto chunk option, and behave differently in formats like revealjs. For sections that should never execute but should respond to Quarto chunk options, the recommended form is:

```{r}
#| eval: false
#| code-fold: true
… code …
```

Nice-to-have

5. _sec_lab.qmd:36: # noqa: E402 on a commented-out import

# from pygam import LinearGAM, LogisticGAM, s as gam_s, f as gam_f  # noqa: E402

The # noqa: E402 annotation suppresses a linter warning for import-not-at-top-of-file. Since the entire import is commented out, the annotation does nothing — but it will puzzle readers who wonder why there's a linter suppression on dead code. Remove the # noqa or the whole line.

6. s() argument table missing id (_sec_gam_fitting.qmd:43–50)

The table lists bs, k, by, m, pc. The id argument (linking smooths to share a smoothness parameter across predictors) is a commonly-used mgcv feature in multi-predictor models and worth a one-line entry.


Summary

| # | Severity | File | Line | Issue |
|---|----------|------|------|-------|
| 1 | Must-fix | 9 subfiles | (end-of-file) | Missing trailing newlines — carry-over |
| 2 | Must-fix | `_sec_gam_example.qmd`, `_sec_gam_inference.qmd` | 20, 116, 77, 97 | `gratia` used but not in DESCRIPTION — carry-over |
| 3 | Should-fix | `_sec_gam_example.qmd` | 1–7 | `.content-hidden` hides placeholder warning from everyone — carry-over |
| 4 | Should-fix | 7 subfiles | throughout | Display R blocks use markdown `` ```r `` not Quarto `{r}` — carry-over |
| 5 | Low | `_sec_lab.qmd` | 36 | Orphaned `# noqa: E402` on commented-out import — new |
| 6 | Low | `_sec_gam_fitting.qmd` | 43–50 | `s()` table missing `id` argument — carry-over |

Items 1 and 2 are blockers for a clean build on a fresh environment. Items 3–6 are convention/clarity issues. The mathematical content and code correctness of the chapter remain sound.

claude added 2 commits June 9, 2026 08:00
…deps

- Add trailing newlines to 9 _subfiles/generalized-additive-models/
  files. POSIX requires text files to end with a newline; many tools
  (cat, sed, diff, git's last-line tracking) misbehave without one.
- Convert 7 subfiles' display-only ```r blocks to ```{r}\n#| eval: false
  so they render as code fences without trying to evaluate. Keeps
  the prose-illustration semantics but pulls the snippets into
  Quarto's code-display path (gets syntax highlighting + copy
  button + per-language config), instead of the plain Pandoc one.
- _sec_gam_example.qmd: remove the outer .content-hidden div around
  the callout-warning. The whole point of the warning ("dataset is
  illustrative; swap for an Epi 204 case study") is to be visible
  to readers; hiding it defeats it. The unwrapped callout shows in
  HTML, slides, and PDF, which is the intended audience.
- DESCRIPTION: declare gratia (>=0.9) and mgcv in Suggests. Both
  are loaded in the chapter (eval:false), and listing them lets
  renv resolve consistent versions when the chapter ever gets
  flipped to eval:true.
@claude

claude Bot commented Jun 9, 2026

Contributor

Claude finished @d-morrison's task in 4m 49s.


Code Review — GAM Chapter (PR #779)

  • Gather context from prior review rounds
  • Read all changed files on current branch
  • Check which prior-round issues are resolved
  • Fresh review pass on all changed files
  • Post findings and summary

Resolved since 2026-06-03 review (4 must-fix / should-fix items)

Commit 719594b ("GAM chapter cleanups: trailing newlines, eval:false, content-hidden, deps") addressed all four outstanding blockers:

| # | File | What was fixed |
|---|------|----------------|
| 1 | 9 `_subfiles/` files | Trailing newlines added — all files now end in `\n` |
| 2 | `DESCRIPTION` | `gratia (>= 0.9)` and `mgcv` declared in `Suggests:` |
| 3 | `_sec_gam_example.qmd:1–7` | `.content-hidden` wrapper removed; callout-warning is now visible in HTML, slides, and PDF |
| 4 | 7 subfiles | Standalone display-only `` ```r `` blocks converted to `` ```{r} `` with `#\| eval: false` |

Still open — three low-priority items

1. _sec_lab.qmd:36 — Orphaned # noqa: E402 on a commented-out import (carry-over)

# from pygam import LinearGAM, LogisticGAM, s as gam_s, f as gam_f  # noqa: E402

The E402 suppression is an import-order annotation. Since the entire import is commented out, the annotation silences a linter warning that can't fire — it confuses readers who wonder why dead code needs a linter bypass.

Fix: remove the # noqa: E402 suffix (or remove the whole commented line, since the active import from pygam import LinearGAM, s as gam_s appears at line 238 of the smoothing-spline block).



2. _sec_gams.qmd:95 — One ```r block missed by the cleanup commit (new)

The cleanup commit converted standalone display-only ```r blocks in 7 subfiles, but one indented single-liner inside a list bullet in _sec_gams.qmd was missed:

  the correct idiom is:
  ```r
  gam(y ~ sex + s(age, by = sex), data = d, method = "REML")
  ```

This is a short inline example embedded inside a bullet point. It still renders, but is inconsistent with the rest of the chapter (which now uses `` ```{r} `` throughout).

Fix: convert to `` ```{r} `` with `#| eval: false`.

[Fix →](https://claude.ai/code?q=In%20%60_subfiles%2Fgeneralized-additive-models%2F_sec_gams.qmd%60%20line%2095%2C%20change%20the%20%60%60%60%60r%60%20fence%20to%20%60%60%60%60%7Br%7D%60%20and%20add%20%60%23%7C%20eval%3A%20false%60%20on%20the%20next%20line.%20This%20one-liner%20inside%20the%20predictor-by-factor-interactions%20list%20item%20was%20missed%20by%20the%20recent%20cleanup%20commit%20that%20converted%20standalone%20blocks%20in%207%20other%20subfiles.&repo=d-morrison/rme)

---

#### 3. `_sec_gam_fitting.qmd:44–50` — `s()` argument table missing `id` (carry-over)

The table lists `bs`, `k`, `by`, `m`, `pc`. The `id` argument (links two smooths to share a single smoothness parameter, e.g. `s(x1, id = 1) + s(x2, id = 1)`) is a commonly-used option in multi-predictor epidemiologic models and worth one row.

[Add `id` row →](https://claude.ai/code?q=In%20%60_subfiles%2Fgeneralized-additive-models%2F_sec_gam_fitting.qmd%60%20in%20the%20%60s()%60%20argument%20table%20(lines%2044-50)%2C%20add%20a%20row%20for%20the%20%60id%60%20argument%3A%20it%20links%20two%20or%20more%20smooths%20to%20share%20a%20single%20smoothness%20parameter%20(e.g.%20%60s(x1%2C%20id%20%3D%201)%20%2B%20s(x2%2C%20id%20%3D%201)%60)%20and%20is%20a%20commonly%20used%20option%20in%20multi-predictor%20models.&repo=d-morrison/rme)

---

### Summary

| # | Severity | File | Line | Issue |
|---|----------|------|------|-------|
| 1 | Low | `_sec_lab.qmd` | 36 | Orphaned `# noqa: E402` on commented-out import — carry-over |
| 2 | Low | `_sec_gams.qmd` | 95 | One `` ```r `` block not converted to `` ```{r} `` — new |
| 3 | Low | `_sec_gam_fitting.qmd` | 44–50 | `s()` table missing `id` argument — carry-over |

All must-fix blockers from previous rounds are confirmed resolved. The three remaining items are all low-severity convention/completeness issues; none block rendering or correctness. The mathematical content and code correctness of the chapter remain sound.

Replace the built-in MASS::Pima.tr placeholder with the course's WCGS
coronary-heart-disease cohort, addressing the review concern that the
worked example was not an Epi 204 case study (and the student-visible
'illustrative only / replace dataset' callout-warning).

- Outcome: incident CHD (chd69); smoothed continuous predictors:
  age, bmi, chol, sbp.
- Load WCGS from the package's bundled inst/extdata/wcgs.rds.
- Drop the placeholder callout-warning.
- Adjust the narrative to WCGS reality (most smooths near EDF 1,
  cholesterol the predictor most likely to show curvature) without
  asserting specific run-dependent numbers.

Chunks remain eval:false, consistent with the section's existing
style and the book's other WCGS code listings.

https://claude.ai/code/session_01NQSLJmjMN22RpwpgJ6Lw6u
@claude

claude Bot commented Jun 9, 2026

Contributor

Claude finished @d-morrison's task in 5m 3s.


Code Review — GAM Chapter (PR #779)

  • Gather context from prior review rounds
  • Read all changed files on current branch
  • Check which prior-round issues are resolved
  • Fresh review pass on all changed files
  • Post findings and summary

Resolved since 2026-06-09 review

The WCGS dataset commit (91b60e9) addressed the last open blocker from the 2026-06-03 review:

| # | File | What was fixed |
|---|------|----------------|
| 1 | `_sec_gam_example.qmd` | `MASS::Pima.tr` replaced with WCGS (`wcgs.rds`); "illustrative only" callout removed; narrative updated for CHD outcome |

Previous blockers confirmed still resolved from 719594b:

  • Trailing newlines on 9 subfiles ✓
  • gratia (>= 0.9) and mgcv in DESCRIPTION ✓ (confirmed at DESCRIPTION:49–54)
  • All standalone ```r display blocks converted to ```{r} #| eval: false

wcgs.rds exists at inst/extdata/wcgs.rds ✓ — the system.file() call will resolve correctly.


Still open — three low-priority items (all carry-overs)

1. _sec_lab.qmd:36 — Orphaned # noqa: E402 on commented-out import

# from pygam import LinearGAM, LogisticGAM, s as gam_s, f as gam_f  # noqa: E402

The import is commented out; the linter annotation does nothing and puzzles readers. The active pygam import appears at line 238.


2. _sec_gams.qmd:95 — One ```r block missed by the cleanup commit

  the correct idiom is:
  ```r
  gam(y ~ sex + s(age, by = sex), data = d, method = "REML")
  ```

This indented inline example inside the `by = sex` bullet was missed by `719594b`. All other display-only R blocks in the chapter now use `` ```{r} #| eval: false ``.

[Fix →](https://claude.ai/code?q=In%20%60_subfiles%2Fgeneralized-additive-models%2F_sec_gams.qmd%60%20line%2095%2C%20change%20the%20%60%60%60%60r%60%20fence%20to%20%60%60%60%60%7Br%7D%60%20and%20add%20%60%23%7C%20eval%3A%20false%60%20as%20the%20first%20line.%20This%20inline%20example%20inside%20the%20%60by%20%3D%20sex%60%20list%20item%20was%20missed%20by%20the%20recent%20cleanup%20commit%20that%20converted%20all%20other%20standalone%20blocks%20in%20this%20chapter.&repo=d-morrison/rme)

#### 3. `_sec_gam_fitting.qmd:44–50` — `s()` argument table missing `id`

The table lists `bs`, `k`, `by`, `m`, `pc`. The `id` argument (links two smooths to share a single smoothness parameter across predictors, e.g. `s(x1, id = 1) + s(x2, id = 1)`) is commonly used in multi-predictor epidemiologic models and worth one row.

[Fix →](https://claude.ai/code?q=In%20%60_subfiles%2Fgeneralized-additive-models%2F_sec_gam_fitting.qmd%60%20in%20the%20%60s()%60%20argument%20table%20(lines%2044-50)%2C%20add%20a%20row%20for%20the%20%60id%60%20argument%3A%20%22links%20two%20or%20more%20smooths%20to%20share%20a%20single%20smoothness%20parameter%20(e.g.%20%60s(x1%2C%20id%20%3D%201)%20%2B%20s(x2%2C%20id%20%3D%201)%60)%3B%20useful%20when%20the%20same%20exposure%20appears%20in%20multiple%20smooths%22&repo=d-morrison/rme)

---

### Summary

| # | Severity | File | Line | Issue |
|---|----------|------|------|-------|
| 1 | Low | `_sec_lab.qmd` | 36 | Orphaned `# noqa: E402` on commented-out import — carry-over |
| 2 | Low | `_sec_gams.qmd` | 95 | `` ```r `` block not converted to `` ```{r} `` — carry-over |
| 3 | Low | `_sec_gam_fitting.qmd` | 44–50 | `s()` table missing `id` argument — carry-over |

All must-fix blockers from previous rounds are confirmed resolved: trailing newlines, `gratia` dependency, `.content-hidden` wrapper, and the WCGS dataset replacement. The mathematical content, code correctness, and pedagogical structure of the chapter are sound. The three remaining items are low-severity convention/completeness issues — none block rendering or correctness.

@github-actions

github-actions Bot commented Jun 9, 2026

Contributor
PR Preview Action v1.8.1
🚀 Preview available at: https://d-morrison.github.io/rme/pr-preview/pr-779/

Built to branch gh-pages at 2026-06-09 20:58 UTC.

d-morrison and others added 2 commits June 18, 2026 10:32
…chapter

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
1. _sec_lab.qmd:36 — remove the # noqa: E402 trailing the commented-
   out pygam import. The import is already commented out, so the
   linter suppression is meaningless and just noise.

2. _sec_gams.qmd:95 — the inline `s(age, by = sex)` example was the
   one ```r block missed by the prior cleanup commit; convert to
   ```{r} #| eval: false to match the rest of the chapter.

3. _sec_gam_fitting.qmd: add the `id` argument to the s() argument
   table — used to share a single smoothness parameter across two
   or more smooths (e.g. s(x1, id = 1) + s(x2, id = 1)). Commonly
   needed in multi-predictor epi models.

Also merged origin/main (the branch was 125 commits behind).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
@claude

claude Bot commented Jun 18, 2026

Contributor

Claude finished @d-morrison's task in 4m 50s —— View job


Code Review — GAM Chapter (PR #779)

  • Gather context from prior review rounds
  • Read all changed files on current branch
  • Check which prior-round issues are resolved
  • Fresh review pass on all changed files
  • Post findings

Resolved since 2026-06-09 review (3/3)

All three low-priority items from the previous round are confirmed fixed:

| # | File | What was fixed |
|---|------|----------------|
| 1 | `_sec_lab.qmd:36` | Orphaned `# noqa: E402` removed from commented-out import |
| 2 | `_sec_gams.qmd:95` | Inline `` ```r `` block inside list bullet converted to `` ```{r} `` with `# |
| 3 | `_sec_gam_fitting.qmd:49` | `id` row added to `s()` argument table |

All must-fix blockers from prior rounds remain resolved:

  • Trailing newlines on 9 subfiles ✓ (0 "No newline at end of file" in current diff)
  • gratia (>= 0.9) and mgcv in DESCRIPTION:49–53
  • SplineTransformer unused import removed ✓
  • XX.ravel() used consistently in smoothing-spline Python plot ✓
  • Bare --- separators removed from both exercise subfiles ✓
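
The trailing-newline item can be re-checked locally with a base-R sketch along these lines. `ends_with_newline()` is a hypothetical helper, not something in the repository or in any package, and the two temp files stand in for chapter subfiles.

```r
# Hypothetical helper: TRUE if the file's last byte is a line feed
ends_with_newline <- function(path) {
  n_bytes <- file.size(path)
  if (is.na(n_bytes) || n_bytes == 0) return(FALSE)
  bytes <- readBin(path, what = "raw", n = n_bytes)
  bytes[length(bytes)] == as.raw(0x0a)
}

# Stand-in files: one ends with a newline, one does not
with_nl <- tempfile(fileext = ".qmd")
without_nl <- tempfile(fileext = ".qmd")
writeBin(charToRaw("## Section\n"), with_nl)
writeBin(charToRaw("## Section"), without_nl)

ends_with_newline(with_nl)
ends_with_newline(without_nl)
```

Applied to the real subfiles, one would pass the paths from something like `list.files(pattern = "\\.qmd$", recursive = TRUE)` through `vapply()`.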

New findings

1. (Very low) Commented-out import in setup block is now just dead code

_sec_lab.qmd:36

# from pygam import LinearGAM, LogisticGAM, s as gam_s, f as gam_f

The # noqa: E402 annotation was correctly removed, but the commented-out line itself remains. It imports LogisticGAM and f as gam_f, both of which appear in their own import statements within the chunks that use them (lines 293 and 350). The setup-block comment has no purpose and may puzzle students. Remove the whole line.


2. (Very low) Two remaining references to Pima data in illustration examples

  • _sec_gams.qmd:41#exm-gam still says "For the Pima diabetes data…" in a purely conceptual formula example (no code).
  • _sec_local_regression.qmd:40#exm-local-regression has loess(glu ~ age, data = MASS::Pima.tr, span = 0.5) as a quick LOESS syntax illustration.

These are not the main worked example (which was correctly updated to WCGS), and both use MASS::Pima.tr, which is always available from a CRAN package and never broken by a missing local file. Per the PR description, only the main worked example needed updating in this PR. Noting for follow-up completeness.
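
The LOESS syntax in question can be sketched without depending on either dataset. The data below are simulated; the variable names `age` and `chol`, the ranges, and the seed are placeholders chosen for illustration, not values from Pima or WCGS.

```r
# Simulated stand-in data (hypothetical values)
set.seed(50)
sim <- data.frame(age = runif(200, min = 39, max = 59))
sim$chol <- 220 + 10 * sin((sim$age - 39) / 5) + rnorm(200, sd = 25)

# Local regression: span = 0.5 uses the nearest half of the data at each point
fit <- loess(chol ~ age, data = sim, span = 0.5)

# Fitted value and standard error at a single age
pred_50 <- predict(fit, newdata = data.frame(age = 50), se = TRUE)
pred_50$fit
pred_50$se.fit
```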


Summary

| # | Severity | File | Line | Issue |
|---|----------|------|------|-------|
| 1 | Very low | `_sec_lab.qmd` | 36 | Commented-out import is now just dead code; remove it |
| 2 | Very low | `_sec_gams.qmd`, `_sec_local_regression.qmd` | 41, 40 | Remaining Pima references in conceptual illustration examples (low-risk, follow-up PR) |

The chapter is otherwise clean. All prior must-fix and should-fix items are confirmed resolved. The mathematical content, code correctness, and pedagogical structure remain sound. The two remaining notes are housekeeping items that do not block merge.

1. _sec_lab.qmd:36 — drop the commented-out pygam import entirely.
   The active imports of LogisticGAM/gam_f live in the chunks that
   use them (lines 293, 350), so the setup-block comment was just
   dead code that puzzled students.

2. _sec_gams.qmd:38 — switch #exm-gam conceptual example from "Pima
   diabetes data with bmi/glu/age → diabetes" to "WCGS data with
   age/sbp/chol → chd69" so it matches the worked example's WCGS
   framing.

3. _sec_local_regression.qmd:34 — switch #exm-local-regression from
   loess(glu ~ age, MASS::Pima.tr) to loess(chol ~ age, wcgs) for
   the same reason; pivot the surrounding prose from age 40/glucose
   to age 50/cholesterol to match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
@claude

claude Bot commented Jun 18, 2026

Contributor

Claude finished @d-morrison's task in 4m 17s —— View job


Code Review — GAM Chapter (PR #779)

  • Gather context from prior review rounds
  • Read current state of changed files
  • Check which prior-round issues are resolved
  • Fresh review pass on all changed files
  • Post findings

Resolved since last review (2/2)

Both "very low" items from the 2026-06-18 review are confirmed fixed:

| # | File | What was fixed |
|---|------|----------------|
| 1 | `_sec_lab.qmd:36` | Commented-out dead import removed entirely (no trace of the line or its `# noqa: E402` annotation) |
| 2 | `_sec_gams.qmd:38–51`, `_sec_local_regression.qmd:40–43` | Both remaining Pima references updated to WCGS: `exm-gam` now uses WCGS CHD example; LOESS example uses `loess(chol ~ age, data = wcgs, ...)` |

All must-fix blockers from prior rounds remain resolved:

  • Trailing newlines: confirmed 0 "No newline at end of file" entries in the full diff ✓
  • gratia (>= 0.9) in DESCRIPTION:49
  • All display-only ```r blocks converted to ```{r} #| eval: false
  • XX.ravel() used consistently in Python smoothing-spline plot ✓
  • Bare --- separators removed from exercise subfiles ✓
  • LOESS correctly described as lacking a parametric formula (not limited to training-point predictions) ✓
  • AIC comment gives the right reason (REML/ML scale mismatch, not fixed-effects structure) ✓
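
The display-only fence item is easy to regress on, so a quick scan may help in future rounds. This is a base-R sketch; `find_plain_r_fences()` is a hypothetical helper, and the `chunk` vector is a made-up stand-in for the lines of a `.qmd` file.

```r
# Hypothetical helper: line numbers of plain ```r fences (not ```{r})
find_plain_r_fences <- function(lines) {
  grep("^\\s*```r\\s*$", lines)
}

# Made-up file contents: one plain fence, one executable-style fence
chunk <- c("Some prose.", "```r", "s(age, by = sex)", "```", "",
           "```{r}", "#| eval: false", "s(age, by = sex)", "```")

find_plain_r_fences(chunk)
```

For a real file, `readLines(path)` would replace `chunk`.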

New finding (low priority)

gratia declared in DESCRIPTION but absent from renv.lock

DESCRIPTION:49 declares gratia (>= 0.9) in Suggests:, and mgcv is already in renv.lock (296 packages). gratia is not. This is expected behaviour for a Suggests-only dependency whose code all sits in #| eval: false chunks. renv::snapshot() records only the packages that its dependency scan detects as used, and that scan skips non-evaluated chunks, so it will never add gratia automatically.

The practical consequence: renv::restore() on a fresh clone will not install gratia, and students who want to run the gratia::draw() / gratia::smooth_samples() code in _sec_gam_example.qmd and _sec_gam_inference.qmd will need to install it separately. Since the code is already annotated library(gratia) # tidy plots for gam fits (optional) and the DESCRIPTION declares the version floor, this is self-documenting.

If the project ever adds a rendered vignette or example that actually runs gratia code, a manual renv::install("gratia"); renv::snapshot() would be needed at that point. For now this is tracked-but-deferred.
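
To confirm whether a package is recorded, the lockfile can be checked directly. A base-R sketch follows; `lockfile_has_package()` is a hypothetical helper that does a plain text match on the `"Package"` field, and the temp file is a trimmed, made-up stand-in for `renv.lock`, not a copy of the real one.

```r
# Hypothetical helper: does the lockfile contain a "Package": "<pkg>" entry?
# Note: pkg is used inside a regex, so a "." in the name matches any character.
lockfile_has_package <- function(lockfile, pkg) {
  lines <- readLines(lockfile, warn = FALSE)
  any(grepl(sprintf('"Package"\\s*:\\s*"%s"', pkg), lines))
}

# Made-up minimal lockfile with a single package entry
demo_lock <- tempfile(fileext = ".lock")
writeLines(c(
  "{",
  '  "Packages": {',
  '    "mgcv": {',
  '      "Package": "mgcv"',
  "    }",
  "  }",
  "}"
), demo_lock)

lockfile_has_package(demo_lock, "mgcv")
lockfile_has_package(demo_lock, "gratia")
```

In the project itself the call would be `lockfile_has_package("renv.lock", "gratia")`, run before and after the manual install-and-snapshot step.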


Overall verdict

The chapter is clean. All must-fix and should-fix items from previous rounds are confirmed resolved; the single remaining note is a low-priority renv tracking issue that does not affect rendering, correctness, or student experience (given the (optional) annotation). The mathematical content, code correctness, and pedagogical structure of the 11-section chapter remain sound.

DESCRIPTION:49 declares gratia (>= 0.9) in Suggests, but
renv::snapshot() doesn't auto-detect it because all gratia code
lives in eval:false chunks. Force-record via renv::record('gratia')
so renv::restore() on a fresh clone installs it.

Other Suggests dependencies are already present in renv.lock (mgcv,
ISLR2, boot, etc.), so this brings gratia into line with that
pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
@claude

claude Bot commented Jun 18, 2026

Contributor

Claude finished @d-morrison's task in 4m 31s —— View job


Code Review — GAM Chapter (PR #779)

  • Gather context from prior review rounds
  • Read all changed files on current branch
  • Confirm which prior-round issues are resolved
  • Fresh review pass on all changed files
  • Post findings

Resolved since last review (all 3 carry-overs)

Commits 225baa7, 7ad9aa8, and 46f37d8 addressed every remaining item:

| # | File | What was fixed |
|---|------|----------------|
| 1 | `_sec_lab.qmd:36` | Commented-out pygam import removed entirely (commit 7ad9aa8) |
| 2 | `_sec_gams.qmd:38–51`, `_sec_local_regression.qmd:34–51` | Both remaining Pima references updated to WCGS (commit 7ad9aa8) |
| 3 | `renv.lock` | gratia 0.11.2 recorded via `renv::record()` so `renv::restore()` installs it on a fresh clone (commit 46f37d8) |

All must-fix and should-fix items from prior rounds confirmed still resolved:

  • Trailing newlines: no \ No newline at end of file in any file's diff ✓
  • gratia (>= 0.9) and mgcv in DESCRIPTION:49,51
  • All display-only R blocks use ```{r} #| eval: false (including the one-liner in _sec_gams.qmd:95-98) ✓
  • XX.ravel() used consistently in Python smoothing-spline plot ✓
  • Bare --- separators removed from both exercise subfiles ✓
  • LOESS correctly described as lacking a parametric formula ✓
  • AIC comment gives the right reason (REML/ML scale mismatch) ✓
  • id argument row added to s() argument table ✓
  • # noqa: E402 orphaned annotation removed ✓

Citation audit

All six BibTeX keys introduced by this PR resolve in references.bib:
@aune2016bmi (line 1706, with pages = {i2156}), @gasparrini2015mortality (line 1716), @wood2013pvalues (line 1727), @james2023islp (line 1738), @wood2017generalized (line 797), @me4 (line 1245). ✓

Fresh review: no new issues found

The chapter is clean. All must-fix blockers and should-fix items from all prior rounds are confirmed resolved. The mathematical content, code correctness, and pedagogical structure of the 11-section chapter are sound.

