Sample size em edits by d-morrison · Pull Request #369 · UCD-SERG/serocalculator

d-morrison · 2025-02-04T01:03:50Z

Thank you for taking the time to submit a pull request!

To maximize the chances of acceptance:

The title of your PR should briefly describe the change.
The body of your PR should contain Fixes #issue-number (if relevant).
Commit/merge messages to be included in NEWS.md should begin with -.
Code should follow the tidyverse style guide.
Documentation should use roxygen2, with Markdown syntax.
Contributions should include unit tests (using testthat).

For more information see Contributing.

--author "Kwan Ho Lee" <ksjlee@ucdavis.edu>

codecov · 2025-02-04T01:07:46Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.

Copilot

Pull request overview

This PR adds a new data-raw/ script to run parallelized seroincidence simulations for exploring sample-size behavior, and updates package metadata to include additional optional dependencies used by supporting workflows/scripts.

Changes:

Add data-raw/sample-size.R to simulate cross-sectional serology data and summarize CI coverage across scenarios.
Add furrr and svglite to Suggests in DESCRIPTION (for optional parallel mapping and SVG-based visual regression testing support).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File	Description
DESCRIPTION	Adds optional dependencies (`furrr`, `svglite`) needed by scripts/testing tools.
data-raw/sample-size.R	New simulation + summarization script for evaluating sample size / CI coverage under different incidence rates.

+    est <- est.incidence(
+      pop_data = csdata,
+      curve_params = dmcmc,
+      noise_params = cond,
+      lambda_start = 0.1,
+      build_graph = FALSE,
+      verbose = FALSE,
+      print_graph = FALSE,
+      antigen_isos = antibodies
+    )


+## Load longitudinal parameters
+
+test_sim <- "https://osf.io/download/rtw5k" %>%
+  load_curve_params() %>%


+    csdata <- sim.cs(
+      curve_params = dmcmc,
+      lambda = lambda,
+      n.smpl = nrep,
+      age.rng = lifespan,
+      antigen_isos = antibodies,
+      n.mc = 0,
+      renew.params = TRUE,  # Use different parameters for each simulation
+      add.noise = TRUE,
+      noise_limits = dlims,
+      format = "long"
+    )


+results_100 <- simulate_seroincidence(lambda = 0.2, nrep = 100, n_sim = 300)
+
+results_100_0.1 <- simulate_seroincidence(lambda = 0.1, nrep = 100, n_sim = 300)
+
+results_100_0.01 <- simulate_seroincidence(lambda = 0.1, nrep = 100, n_sim = 300)
+


+
+results_100_0.1 <- simulate_seroincidence(lambda = 0.1, nrep = 100, n_sim = 300)
+
+results_100_0.01 <- simulate_seroincidence(lambda = 0.1, nrep = 100, n_sim = 300)


+final_table_50 <- results_50 |> generate_final_table(sample_size = 50)
+final_table_100 <- results_100 |> generate_final_table(sample_size = 100)
+final_table_100_0.1 <- results_100_0.1 |> generate_final_table(sample_size = 100)
+ft_100_0.01 <- results_100_0.01 |> generate_final_table()
+lambda <- function(results_table)


claude · 2026-06-03T19:48:34Z

Claude finished @d-morrison's task in 2m 38s —— View job

Code Review

Gather context and read changed files
Review DESCRIPTION changes
Review data-raw/sample-size.R
Post findings

Summary

This PR adds data-raw/sample-size.R — a simulation script for exploring CI coverage under different sample sizes and incidence rates — and adds furrr and svglite to Suggests. The DESCRIPTION changes are fine. The script has one critical runtime error, two lint failures that will block CI, and several deprecated API calls and reproducibility gaps.

Critical Bug — Will Error at Runtime

data-raw/sample-size.R, line 55–64 — est.incidence() is a pure ...-passthrough to est_seroincidence(). The argument curve_params = dmcmc has no matching parameter in est_seroincidence() (which uses sr_params, not curve_params); it will be absorbed by ... silently. But sr_params has no default value, so the call will immediately fail with argument "sr_params" is missing, with no default. The function will never run successfully as written.

Fix: replace the block with the current API:

est <- est_seroincidence(
  pop_data    = csdata,
  sr_params   = dmcmc,
  noise_params = cond,
  lambda_start = 0.1,
  build_graph  = FALSE,
  verbose      = FALSE,
  antigen_isos = antibodies
)

Fix this →

CI Lint Failures — Will Block Merge

Lines 102, 104, 152, 153 — Object names containing . violate the repo's object_name_linter (snake_case). The .lintr.R excludes pipe_consistency_linter and undesirable_function_linter for data-raw/, but object_name_linter is not excluded and applies here.

Line	Current name	Valid snake_case alternative
102	`results_100_0.1`	`results_100_0_1`
104	`results_100_0.01`	`results_100_0_01`
152	`final_table_100_0.1`	`final_table_100_0_1`
153	`ft_100_0.01`	`ft_100_0_01`

All downstream references to these names (lines 152, 153, 178–179) must be updated consistently.

Fix this →

Logic Bug — Mislabeled Result

Line 104 — results_100_0.01 is named to suggest λ=0.01, but simulate_seroincidence(lambda = 0.1, ...) passes λ=0.1 — the same as line 103. The coverage table it feeds into (ft_100_0.01) will therefore report results for the wrong scenario. Either change lambda = 0.1 to lambda = 0.01 or rename the object. (Copilot flagged this too.)

Reproducibility — Missing Seeds

Lines 102 and 104 — results_50 (line 97) and results_100 (line 100) each have a set.seed() call before them, which seeds furrr's parallel RNG via furrr_options(seed = TRUE). The two subsequent simulate_seroincidence() calls at lines 102 and 104 have no seed, so their output will differ across runs. Add set.seed() calls before each.

Deprecated API Calls — Will Emit Lifecycle Warnings

All three deprecated wrappers emit lifecycle::deprecate_soft() warnings and will eventually be removed. data-raw/ scripts are run by maintainers and should stay aligned with the current public API.

Line	Deprecated call	Current replacement
10	`load_curve_params(url)`	`load_sr_params(url)`
41	`sim.cs(curve_params=, n.smpl=, age.rng=, n.mc=, renew.params=, add.noise=, ...)`	`sim_pop_data(curve_params=, n_samples=, age_range=, n_mcmc_samples=, renew_params=, add_noise=, ...)`
55	`est.incidence(pop_data=, curve_params=, noise_params=, ...)`	`est_seroincidence(pop_data=, sr_params=, noise_params=, ...)` (see Critical Bug above)

Fix this →

Minor Issues

Line 92: library(future) is called in the middle of the script, after the function definition. Move all library() calls to the top alongside the existing ones (lines 1–4). (undesirable_function_linter is excluded for data-raw/, so no lint failure, but it hurts readability.)
Line 120: 1:length(results_list) is fragile when results_list is empty (produces c(1, 0)). Prefer seq_along(results_list).
DESCRIPTION / future: The script calls future::plan() directly (line 93, 107) but future is not listed in Suggests. Since furrr depends on future, it will be available transitively, but explicit declaration makes the dependency intent clear.

DESCRIPTION Changes

furrr and svglite additions are correctly placed in Suggests (alphabetical order is preserved for furrr; svglite is placed between tidyverse and vdiffr which is fine). No concerns here.

claude · 2026-06-03T20:13:06Z

Claude Code is working…

I'll analyze this and get back to you.

View job run

- Replace deprecated load_curve_params/sim.cs/est.incidence with load_sr_params/sim_pop_data/est_seroincidence (and sr_params arg) - Rename objects with dots to snake_case (object_name_linter) - Fix mislabeled lambda=0.01 scenario and add per-call seeds - Use seq_along(); declare future in Suggests; move library(future) to top

d-morrison · 2026-06-03T20:14:01Z

Addressed the review findings in data-raw/sample-size.R:

Critical/deprecated API: replaced load_curve_params() → load_sr_params(), sim.cs() → sim_pop_data() (renamed n.smpl/age.rng/n.mc/renew.params/add.noise → n_samples/age_range/n_mcmc_samples/renew_params/add_noise), and est.incidence(curve_params=) → est_seroincidence(sr_params=) (fixes the missing-sr_params runtime error).
snake_case: renamed results_100_0.1/results_100_0.01/final_table_100_0.1/ft_100_0.01 to underscore forms and updated all references.
Logic fix: the λ=0.01 scenario now actually passes lambda = 0.01.
Reproducibility: added set.seed() before the two unseeded runs.
Minor: seq_along() instead of 1:length(), moved library(future) to the top, and declared future in Suggests.

This is an internal maintainer script in data-raw/ (not user-visible / not installed), so I applied the no changelog + no version increment labels for the version-check and changelog gates, and added gh to WORDLIST to fix spellcheck.

@claude review

Generated by Claude Code

claude · 2026-06-03T20:14:49Z

Claude finished @d-morrison's task in 2m 16s —— View job

Code Review

Gather context and read changed files
Verify previous findings were addressed
Check for new issues
Post findings

Summary

All findings from the previous review have been addressed. The script is substantially improved. A few minor remaining items below — none are blocking for a data-raw/ maintainer script.

Previous Findings — All Resolved ✅

Finding	Status
Critical: `est.incidence(curve_params=)`	✅ Fixed → `est_seroincidence(sr_params=)`
Lint: dots in object names	✅ Fixed → snake_case underscores
Logic: λ=0.01 scenario used λ=0.1	✅ Fixed → `lambda = 0.01` on line 105
Seeds missing before two simulation calls	✅ Fixed → `set.seed()` before all 4 calls
Deprecated `load_curve_params()`	✅ Fixed → `load_sr_params()`
Deprecated `sim.cs()`	✅ Fixed → `sim_pop_data()`
`1:length()` fragile	✅ Fixed → `seq_along()`
`library(future)` mid-script	✅ Fixed → moved to top
`future` not in `Suggests`	✅ Fixed

Remaining Minor Items

Line 77 — 1:n_sim still fragile

The previous fix changed 1:length(results_list) → seq_along(), but 1:n_sim (same class of issue) remains:

furrr::future_map(.x = 1:n_sim, ...)

Prefer seq_len(n_sim) — if n_sim = 0 (e.g. testing), 1:0 produces c(1, 0) and runs two iterations.

Fix this →

Lines 179–180 — Coverage not printed

coverage_count_50 and coverage_count_100 are captured and printed (lines 175–177), but the λ=0.1 and λ=0.01 results are silently dropped:

final_table_100_0_1 |> compute_coverage()   # result discarded
ft_100_0_01 |> compute_coverage()           # result discarded

When run non-interactively (e.g. Rscript data-raw/sample-size.R) these results are lost. For consistency, either assign or pipe through print().

Lines 182–183 — paste() on a data frame

compute_coverage() returns a data frame, so:

print(paste("Number of rows where CI covers true lambda:", coverage_count_50))

will produce something like "Number of rows where CI covers true lambda: c(no_error_count = …)" rather than a useful message. Consider cat() on individual columns or just print(coverage_count_50) directly.

DESCRIPTION / WORDLIST

furrr, future, and svglite correctly placed in Suggests (alphabetical order maintained). gh in WORDLIST is fine. No concerns.

github-actions · 2026-06-03T20:15:27Z

👀 Picked up by workflow run #26910326827. R setup runs first; Claude itself responds after that.

github-actions · 2026-06-03T20:39:33Z

📖 https://UCD-SERG.github.io/serocalculator/preview/pr369/
Preview documentation for this PR (at commit 4f21266)

d-morrison · 2026-06-03T20:41:21Z

@claude review

Generated by Claude Code

github-actions · 2026-06-03T20:41:40Z

👀 Picked up by workflow run #26911728637. R setup runs first; Claude itself responds after that.

d-morrison added 4 commits January 30, 2025 10:04

add sample size code

b465637

--author "Kwan Ho Lee" <ksjlee@ucdavis.edu>

clarify namespaces

0e2c4be

in progress

0862d29

Merge commit '9521a00cf6ae60c20205610abb5123537b58408d'

6aa712c

d-morrison added 2 commits February 4, 2025 10:30

Merge branch 'main' into sample-size-em-edits

856a1df

Merge branch 'main' into sample-size-em-edits

95d8324

d-morrison marked this pull request as ready for review May 19, 2026 02:40

Copilot AI review requested due to automatic review settings May 19, 2026 02:40

Merge branch 'main' into sample-size-em-edits

c894533

Copilot started reviewing on behalf of d-morrison May 19, 2026 02:41 View session

Copilot AI reviewed May 19, 2026

View reviewed changes

github-actions Bot added a commit that referenced this pull request May 19, 2026

Update PR #369 preview

84e238b

Merge branch 'main' into sample-size-em-edits

aeb3f00

github-actions Bot added a commit that referenced this pull request Jun 3, 2026

Update PR #369 preview

2535b84

Add 'gh' to WORDLIST to fix spellcheck

6b70b2f

d-morrison added no changelog no version increment labels Jun 3, 2026 — with Claude

Bump dev version to 1.4.0.9014

4f21266

d-morrison removed the no version increment label Jun 3, 2026 — with Claude

github-actions Bot added a commit that referenced this pull request Jun 3, 2026

Update PR #369 preview

edebe14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Sample size em edits#369

Sample size em edits#369
d-morrison wants to merge 11 commits into
mainfrom
sample-size-em-edits

d-morrison commented Feb 4, 2025

Uh oh!

codecov Bot commented Feb 4, 2025 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

claude Bot commented Jun 3, 2026 •

edited

Loading

Uh oh!

claude Bot commented Jun 3, 2026

Uh oh!

d-morrison commented Jun 3, 2026

Uh oh!

claude Bot commented Jun 3, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 3, 2026

Uh oh!

github-actions Bot commented Jun 3, 2026

Uh oh!

d-morrison commented Jun 3, 2026

Uh oh!

github-actions Bot commented Jun 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants


		results_100_0.1 <- simulate_seroincidence(lambda = 0.1, nrep = 100, n_sim = 300)

		results_100_0.01 <- simulate_seroincidence(lambda = 0.1, nrep = 100, n_sim = 300)

Conversation

d-morrison commented Feb 4, 2025

Uh oh!

codecov Bot commented Feb 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

claude Bot commented Jun 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review

Summary

Critical Bug — Will Error at Runtime

CI Lint Failures — Will Block Merge

Logic Bug — Mislabeled Result

Reproducibility — Missing Seeds

Deprecated API Calls — Will Emit Lifecycle Warnings

Minor Issues

DESCRIPTION Changes

Uh oh!

claude Bot commented Jun 3, 2026

Uh oh!

d-morrison commented Jun 3, 2026

Uh oh!

claude Bot commented Jun 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review

Summary

Previous Findings — All Resolved ✅

Remaining Minor Items

DESCRIPTION / WORDLIST

Uh oh!

github-actions Bot commented Jun 3, 2026

Uh oh!

github-actions Bot commented Jun 3, 2026

Uh oh!

d-morrison commented Jun 3, 2026

Uh oh!

github-actions Bot commented Jun 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

codecov Bot commented Feb 4, 2025 •

edited

Loading

claude Bot commented Jun 3, 2026 •

edited

Loading

claude Bot commented Jun 3, 2026 •

edited

Loading