Merge dev-tidy into main #1509

Open: ericward-noaa wants to merge 9 commits into main from dev-tidy

@ericward-noaa (Contributor) commented Jun 3, 2026

What is the feature?

#1452
Add broom/yardstick compatibility via tidy(), glance(), and augment()

This PR includes

  • tidy.FIMSFit: one row per parameter with generics column conventions
    (term, estimate, std.error, statistic, p.value); supports conf.int
    and filtering by estimation_type
  • glance.FIMSFit: one-row model summary including logLik, AIC, BIC,
    max_gradient, converged, and terminal_ssb
  • augment.FIMSFit: observed vs. expected pairs renamed to yardstick
    conventions (.truth, .pred, .weight) for direct use with
    yardstick metric functions

Also adds get_fit_metrics() and get_fit_stream() as convenience
wrappers for computing grouped yardstick metrics from a FIMSFit.
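
As a rough illustration of the yardstick conventions described above, the sketch below builds a toy tibble with `.truth` and `.pred` columns and computes a grouped RMSE. The data values and the use of `label` as the grouping column are invented for illustration; the actual output of `augment.FIMSFit()` may carry different metadata columns, and this does not call `get_fit_metrics()` itself.

```r
library(dplyr)
library(yardstick)

# Toy stand-in for augment() output; all values are invented for illustration.
augmented <- tibble(
  label  = rep(c("landings_expected", "index_expected"), each = 3),
  .truth = c(10, 12, 14, 1.0, 2.0, 3.0),
  .pred  = c(11, 12, 13, 1.5, 2.0, 2.5)
)

# yardstick metrics respect dplyr groups, so this yields one RMSE per label.
metrics <- augmented |>
  group_by(label) |>
  rmse(truth = .truth, estimate = .pred)

print(metrics)
```

The result has one row per `label`, with the metric value in the `.estimate` column.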

Fixes a segfault in get_estimates(): obj$gr() was being
called twice, once in FIMSFit() and again in reshape_tmb_estimates().
The gradient vector is now computed once at construction time, stored
in a new FIMSFit@gradient slot, and passed through to avoid the second
C++ call after clear() has freed memory.
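
The compute-once pattern can be sketched in base R as follows. This is a minimal illustration, not the FIMS implementation: `make_mock_obj()` and `reshape_with_gradient()` are hypothetical stand-ins for a TMB object and the downstream reshaping step, and the gradient values are invented.

```r
# Mock objective that counts how often gr() is called. It stands in for a
# TMB object and is purely hypothetical.
make_mock_obj <- function() {
  calls <- 0
  list(
    par = c(a = 1, b = 2),
    # Return a 1 x n matrix, as TMB's gr() can do in some call patterns.
    gr = function(par) {
      calls <<- calls + 1
      matrix(c(1e-4, -3e-5), nrow = 1)
    },
    n_calls = function() calls
  )
}

obj <- make_mock_obj()

# Compute the gradient once and flatten it to a plain numeric vector.
gradient_vector <- as.numeric(obj$gr(obj$par))
max_gradient <- max(abs(gradient_vector))

# Hypothetical downstream helper: it receives the stored gradient rather than
# the object, so it never needs to call gr() again.
reshape_with_gradient <- function(par, gradient) {
  data.frame(term = names(par), estimate = unname(par), gradient = gradient)
}

estimates <- reshape_with_gradient(obj$par, gradient_vector)
```

Passing the stored vector downstream means later steps do not depend on the underlying object still being usable.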


Instructions for code reviewer

👋Hello reviewer👋, thank you for taking the time to review this PR!

  • Please use this checklist during your review, checking off items that you have verified are complete but feel free to skip over items that are not relevant!
  • See the GitHub documentation for how to comment on a PR to indicate where you have questions or changes are needed before approving the PR.
  • Please use standard conventional messages for both commit messages and comments
  • PR reviews are a great way to learn so feel free to share your tips and tricks. However, when suggesting changes to the PR that are optional please include nit: (for nitpicking) as the comment type. For example, nit: I prefer using a data.frame() instead of a matrix because ...
  • Engage with the developer. Make it clear when the PR is approved by selecting the approved status, and potentially commenting on the PR with something like This PR is now ready to be merged.

Checklist

  • The code is well-designed
  • The code is designed well for both users and developers
  • Code coverage remains high
  • Comments are clear, useful, and explain why instead of what
  • Code is appropriately documented (doxygen and roxygen)

@codecov (Bot) commented Jun 3, 2026

Codecov Report

❌ Patch coverage is 87.56477% with 24 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.52%. Comparing base (9ec9dd5) to head (bb96b4b).
⚠️ Report is 3 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| R/augment-fims.R | 76.34% | 22 Missing ⚠️ |
| R/reshape_output.R | 85.71% | 1 Missing ⚠️ |
| R/tidy-glance-fims.R | 98.80% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1509      +/-   ##
==========================================
- Coverage   87.53%   87.52%   -0.01%     
==========================================
  Files          97       99       +2     
  Lines        8749     8937     +188     
  Branches      516      523       +7     
==========================================
+ Hits         7658     7822     +164     
- Misses       1053     1078      +25     
+ Partials       38       37       -1     

ericward-noaa and others added 8 commits June 9, 2026 15:15
Implements the three standard broom generics for FIMSFit objects:

- tidy.FIMSFit: one row per parameter with broom column conventions
  (term, estimate, std.error, statistic, p.value); supports conf.int
  and filtering by estimation_type
- glance.FIMSFit: one-row model summary including logLik, AIC, BIC,
  max_gradient, converged, and terminal_ssb
- augment.FIMSFit: observed vs. expected pairs renamed to yardstick
  conventions (.truth, .pred, .weight) for direct use with
  yardstick metric functions

Also adds get_fit_metrics() and get_fit_stream() as convenience
wrappers for computing grouped yardstick metrics from a FIMSFit.

Fixes a segfault in get_estimates(): obj$gr() was being
called twice — once in FIMSFit() and again in reshape_tmb_estimates().
The gradient vector is now computed once at construction time, stored
in a new FIMSFit@gradient slot, and passed through to avoid the second
C++ call after clear() has freed memory.

@kellijohnson-NOAA (Contributor)

@ericward-noaa at first glance all of this is GREAT! Thank you. I am wondering if we should rename "observed" and "predicted" to ".true" and ".estimated" so we do not have to rename anything?

@ericward-noaa (Contributor, Author)

That's a good suggestion, @kellijohnson-NOAA. I'd defer to you and others. 'observed' and 'expected' or 'observed' and 'predicted' are a little more intuitive for interpretation, and ".true" and ".estimated" are what yardstick needs. There was another comment on a recent issue about periods confusing people, so I think that is the tradeoff.

@e-perl-NOAA (Contributor)

I am the person that commented about confusion with the ".xxx" notation. I think my main concern with that is putting yet another different type of notation in front of people. Additionally (and this may just be a me thing), I associate the period notation with Python, though I know that ggplot uses it for some of their functions (which I also don't love). However, this is BY NO MEANS a 🗻 that I will ⚰️ on. It's not even a hill that I would climb up 🤣.

@kellijohnson-NOAA (Contributor) left a comment

I have a few changes and comments but they are pretty minor. I think this PR is a great step forward. There are some places where we could make the FIMS codebase better in the future so less has to be done inside these functions but I think we should merge this in and then let early adopters play around with things before we make any additional changes to fit_fims() or other functions.

Comment thread R/augment-fims.R
Comment on lines +205 to +210
if (!requireNamespace("yardstick", quietly = TRUE)) {
  cli::cli_abort(
    "Package {.pkg yardstick} is required. Install it with
    {.code install.packages('yardstick')}."
  )
}

{yardstick} is in the Imports section of the DESCRIPTION file, so this check seems unnecessary.

Comment thread R/augment-fims.R
Comment on lines +100 to +109
dplyr::select(
  dplyr::all_of(meta_cols),
  ".truth" = "observed",
  ".pred" = "expected",
  dplyr::any_of("uncertainty")
) |>
  dplyr::mutate(
    .truth = as.numeric(.data$.truth),
    .pred = as.numeric(.data$.pred)
  )

I think it would be cleaner to mutate first, then select.

Suggested change
- dplyr::select(
-   dplyr::all_of(meta_cols),
-   ".truth" = "observed",
-   ".pred" = "expected",
-   dplyr::any_of("uncertainty")
- ) |>
-   dplyr::mutate(
-     .truth = as.numeric(.data$.truth),
-     .pred = as.numeric(.data$.pred)
-   )
+ dplyr::mutate(
+   .truth = as.numeric(.data$observed),
+   .pred = as.numeric(.data$expected)
+ ) |>
+   dplyr::select(
+     dplyr::all_of(c(meta_cols, ".truth", ".pred")),
+     dplyr::any_of("uncertainty")
+   )
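
A minimal, self-contained sketch of the mutate-then-select ordering is shown below. The toy `fit_data` tibble, its values, and the single-entry `meta_cols` are invented for illustration, and the sketch assumes the intent is to keep `.truth` and `.pred` in the final selection.

```r
library(dplyr)

# Toy stand-in for the reshaped fit output; columns and values are invented.
fit_data <- tibble(
  module_id   = c(1L, 1L, 2L),
  observed    = c("10", "12", "1.5"), # stored as character to show coercion
  expected    = c("11", "12", "1.4"),
  uncertainty = c(0.1, 0.1, 0.2),
  extra       = c("a", "b", "c")
)
meta_cols <- "module_id"

out <- fit_data |>
  # Create the yardstick-style columns, coercing to numeric in the same step.
  mutate(
    .truth = as.numeric(observed),
    .pred  = as.numeric(expected)
  ) |>
  # Keep the metadata, the new columns, and uncertainty if it is present.
  select(
    all_of(c(meta_cols, ".truth", ".pred")),
    any_of("uncertainty")
  )
```

With this ordering the renaming and the type coercion happen in a single `mutate()` call, and `select()` only has to list the columns to keep.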

Comment thread R/augment-fims.R
Comment on lines +120 to +123
dplyr::select(-"uncertainty")
} else if ("uncertainty" %in% names(out)) {
out <- dplyr::select(out, -"uncertainty")
}

Can we drop the removal of uncertainty from inside the if statement, remove the else branch entirely, and then just use `dplyr::select(-dplyr::any_of("uncertainty"))` with the return statement?
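
The behavior this relies on can be checked with a small sketch: `any_of()` ignores names that are absent, so the same call works whether or not the column exists. The two toy tibbles and the `drop_uncertainty()` helper are hypothetical and only serve the illustration.

```r
library(dplyr)

# Two toy inputs, one with and one without the uncertainty column.
with_col <- tibble(
  .truth = c(1, 2),
  .pred = c(1.1, 1.9),
  uncertainty = c(0.1, 0.2)
)
without_col <- tibble(.truth = c(1, 2), .pred = c(1.1, 1.9))

# any_of() silently skips missing names, so no if/else branch is needed.
drop_uncertainty <- function(x) select(x, -any_of("uncertainty"))

names(drop_uncertainty(with_col))
names(drop_uncertainty(without_col))
```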

Comment thread inst/WORDLIST
Comment on lines +168 to +169
optimised
optimized

Suggested change
- optimised
- optimized

Neither of these is needed with the cspell routine.

Comment thread inst/WORDLIST
rlang
rlib
rmarkdown
rmse

Suggested change
- rmse

cspell is case agnostic for stuff like this.

Comment thread R/augment-fims.R
#' @param x A `FIMSFit` object **or** an already-augmented tibble from
#' `augment.FIMSFit()`.
#' @param stream_label Character scalar. The value of the `label` column to
#' retain, e.g. `"landings_expected"`, `"index_expected"`,

Suggested change
- #' retain, e.g. `"landings_expected"`, `"index_expected"`,
+ #' retain, e.g., `"landings_expected"`, `"index_expected"`,

Comment thread R/augment-fims.R
#' `"agecomp_expected"`, or `"lengthcomp_expected"`. If `NULL` (default),
#' no filtering on label is done.
#' @param module_id Integer scalar. The `module_id` of the fleet or survey to
#' retain (e.g. `1` for the first fishing fleet, `2` for the first survey in

Suggested change
- #' retain (e.g. `1` for the first fishing fleet, `2` for the first survey in
+ #' retain (e.g., `1` for the first fishing fleet, `2` for the first survey in

Comment thread R/fimsfit.R
# calling obj$gr() again — which segfaults after clear() frees C++ memory.
# as.numeric() is required because TMB's obj$gr() returns a matrix (1×n)
# in some call patterns, which would fail the "numeric" slot type check.
gradient_vec <- if (length(opt) > 0) {

Suggested change
- gradient_vec <- if (length(opt) > 0) {
+ gradient_vector <- if (length(opt) > 0) {

Comment thread R/fimsfit.R
NA_real_
rep(NA_real_, length(obj[["par"]]))
}
max_gradient <- if (length(opt) > 0) max(abs(gradient_vec)) else NA_real_

Suggested change
- max_gradient <- if (length(opt) > 0) max(abs(gradient_vec)) else NA_real_
+ max_gradient <- if (length(opt) > 0) max(abs(gradient_vector)) else NA_real_

Comment thread R/fimsfit.R
obj = obj,
opt = opt,
max_gradient = max_gradient,
gradient = gradient_vec,

Suggested change
- gradient = gradient_vec,
+ gradient = gradient_vector,
