Expose Population proportion_female with default 0.5#1494

Open
szu-yun-ko wants to merge 10 commits into NOAA-FIMS:main from szu-yun-ko:feat/prop-female

@szu-yun-ko szu-yun-ko commented May 29, 2026

Contributor

What is the feature?

How have you implemented the solution?

User / R layer

  • create_default_parameters(): added a single Population row (proportion_female = 0.5, estimation_type = "constant") so users can edit it with the usual dplyr/tidyr workflow.
  • initialize_module(): proportion_female is initialized via the standard set_param_vector() path, like log_M and log_init_naa. Unlike log_f_multiplier, it is not excluded from module_fields and does not require manual setup in the standard initialize_fims() workflow.
  • Exposed proportion_female on the Population Rcpp module (src/fims_modules.hpp). Direct access uses the ParameterVector API: pop$proportion_female[1]$value <- 0.35.

Rcpp

  • Added proportion_female as a ParameterVector on PopulationInterface. This resolves the live-object copy sync issue seen when it was a plain double.
  • Constructor sets slot [0] to 0.5 with estimation_type = "constant", so integration tests and manual builds that skip the tibble do not inherit Parameter()'s default of 0.
  • In add_to_fims_tmb_internal(), if proportion_female is unset or has invalid size, it is filled with a default scalar value of 0.5 (similar to log_f_multiplier defaulting to zeros). Valid sizes are 1 (scalar, recycled across ages) or n_ages (per-age values). Values are copied to the TMB Population vector without broadcasting to n_ages.
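The size rule in the last bullet can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `fill_default_if_invalid` is a hypothetical helper, and `std::vector<double>` stands in for the ParameterVector handled in add_to_fims_tmb_internal().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of FIMS): keep a proportion_female vector
// only if its length is 1 (scalar) or n_ages (per-age values); otherwise
// fall back to a single default value of 0.5.
inline std::vector<double> fill_default_if_invalid(
    const std::vector<double>& proportion_female, std::size_t n_ages) {
  assert(n_ages > 0);  // a population needs at least one age class
  const std::size_t n = proportion_female.size();
  const bool valid_size = (n == 1) || (n == n_ages);
  if (!valid_size) {
    return std::vector<double>{0.5};  // unset or wrong length: scalar default
  }
  // Valid input is returned unchanged; it is not broadcast to n_ages here.
  return proportion_female;
}
```

An empty vector or a vector of length 3 with 12 ages would both come back as a single 0.5, while a length-1 or length-12 vector would pass through untouched.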

C++ layer

  • proportion_female remains a fims::Vector<Type> on fims_popdy::Population, defaulting to length 1 with value 0.5.
  • Spawning biomass and related calculations now use get_force_scalar(age) so a length-1 vector is recycled across all ages, while a length-n_ages vector supports per-age values.
  • Removed the Prepare() loop that forced proportion_female[age] = 0.5 on every run, which previously overwrote user-provided values.
  • Initialize() no longer resizes proportion_female to n_ages; it only ensures a default exists if the vector is empty.
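The recycling behaviour described in this list can be sketched as below. `RecyclingVector` is a hypothetical stand-in written for illustration; it only models the length-1 versus per-age lookup attributed to get_force_scalar() above and is not the fims::Vector implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative stand-in for a FIMS-style vector (an assumption, not the
// real class). Only the recycling lookup is modelled.
struct RecyclingVector {
  std::vector<double> values;

  // Return element 0 when the vector has length 1, so a scalar is reused
  // for every age; otherwise return the element at `pos`.
  double get_force_scalar(std::size_t pos) const {
    assert(!values.empty());
    if (values.size() == 1) {
      return values[0];
    }
    assert(pos < values.size());
    return values[pos];
  }
};
```

With this shape, calling code can always index by age and does not need to know whether the user supplied one value or one value per age.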

Testing this PR

End-to-end test: verify that the value reaches the TMB model, not only the R interface.

library(FIMS)
library(dplyr)
data("data_big")
data_4_model <- FIMSFrame(data_big)
base_params <- data_4_model |>
  create_default_configurations() |>
  create_default_parameters(data = data_4_model) |>
  tidyr::unnest(cols = data)
low_female_params <- base_params |>
  mutate(value = if_else(label == "proportion_female", 0.35, value))
clear()
fit_default <- initialize_fims(base_params, data_4_model) |>
  fit_fims(optimize = FALSE, get_sd = FALSE)
clear()
fit_low <- initialize_fims(low_female_params, data_4_model) |>
  fit_fims(optimize = FALSE, get_sd = FALSE)

fit_default@report[["spawning_biomass"]][[1]]
fit_low@report[["spawning_biomass"]][[1]]

Does the PR impact any other area of the project, maybe another repo?

  • No.

Instructions for code reviewer

👋Hello reviewer👋, thank you for taking the time to review this PR!

  • Please use this checklist during your review, checking off items that you have verified are complete but feel free to skip over items that are not relevant!
  • See the GitHub documentation for how to comment on a PR to indicate where you have questions or changes are needed before approving the PR.
  • Please use standard conventional messages for both commit messages and comments.
  • PR reviews are a great way to learn so feel free to share your tips and tricks. However, when suggesting changes to the PR that are optional please include nit: (for nitpicking) as the comment type. For example, nit: I prefer using a data.frame() instead of a matrix because ...
  • Engage with the developer. Make it clear when the PR is approved by selecting the approved status, and potentially commenting on the PR with something like This PR is now ready to be merged.

Checklist

  • The PR requests the appropriate base branch (dev for features and main for hot fixes)
  • The code is well-designed
  • The code is designed well for both users and developers
  • Code coverage remains high
  • Comments are clear, useful, and explain why instead of what
  • Code is appropriately documented (doxygen and roxygen)

@github-actions

Contributor

Thank you for contributing to FIMS and opening your first PR here! We are happy to have your contributions. Please ensure that the PR is made to the dev branch and let us know if you need any help! Also, we encourage you to introduce yourself to the community on the introduction thread in our Discussions.

@github-actions

github-actions Bot commented May 29, 2026

Contributor

🎨 Chore: code formatting workflow

Our automated workflows cannot run on forks because of permission issues, and thus, we ask that you run the following code locally and push any changes that are created to your feature branch. You will only be reminded of this once per PR. Thank you!

Format C++ code

  1. Install clang-format version 18.0.0
  2. Run the following command from the repository root:
    clang-format -i --style="{BasedOnStyle: Google, SortIncludes: false}" $(find ./inst/include ./src ./tests/gtest -name "*.hpp" -o -name "*.cpp")

Format R code

  1. Install {styler} and {roxygen2}
  2. Run the following commands in R from the repository root:
styler::style_pkg() # Style R code
roxygen2::roxygenise() # Update documentation
styler::style_pkg() # Style R code again
roxygen2::roxygenise() # Update documentation again
usethis::use_tidy_description() # Style DESCRIPTION file

Push changes

  1. Commit the formatting with a commit message of "Chore: format feature branch"
  2. Push to your fork

@szu-yun-ko

Contributor Author

Hi @kellijohnson-NOAA, @nathanvaughan-NOAA, @Andrea-Havron-NOAA. I ran into some questions while implementing this feature and was hoping you could point me in the right direction:

The issue I hit is that PopulationInterface used to register a copy in live_objects (make_shared(*this)), while R keeps the original Population. That is fine for ParameterVector and SharedInt (shared storage on copy) but not for plain double: the tibble / $proportion_female updates hit the R object, while add_to_fims_tmb_internal() on the registered copy still saw 0.5. This PR works around that by registering this instead of a copy (rcpp_population.hpp lines 161–169).
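The copy problem described above can be reproduced in miniature. The sketch below is an assumption-laden illustration, not FIMS code: `MiniInterface`, `Observed`, and `edit_original_after_copy` are hypothetical names, and a `std::shared_ptr<double>` stands in for the shared storage that ParameterVector and SharedInt are said to use.

```cpp
#include <cassert>
#include <memory>

// Hypothetical miniature of an interface object. `plain` is copied by
// value; `shared` points at storage that every copy keeps sharing.
struct MiniInterface {
  double plain = 0.5;
  std::shared_ptr<double> shared = std::make_shared<double>(0.5);
};

// What the registered copy sees after the original has been edited.
struct Observed {
  double plain;
  double shared;
};

// Simulate "register a copy, then let the user edit the original".
inline Observed edit_original_after_copy(double new_value) {
  MiniInterface original;
  // Equivalent in spirit to make_shared(*this): a distinct copied object.
  auto registered = std::make_shared<MiniInterface>(original);
  assert(registered.get() != &original);

  original.plain = new_value;    // only the original changes
  *original.shared = new_value;  // both objects see this change
  return Observed{registered->plain, *registered->shared};
}
```

Editing the original to 0.35 leaves the registered copy's plain double at 0.5 while the shared value follows the edit, which matches the symptom reported for the plain double field.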

Question: For the scalar proportion_female functional requirement in #521, which approach should I use?

  1. double + register this (current draft): scalar R API (pop$proportion_female <- 0.35), but less consistent with the other declarations.
  2. ParameterVector length 1: matches FIMS (log_M, inflection_point, etc.), no constructor fix needed. R access is pop$proportion_female[1]$value.
  3. Use SharedReal: SharedReal today seems to use the same implementation as SharedInt in rcpp_shared_primitive.hpp, so it cannot represent proportion_female.

I am still getting familiar with the TMB architecture and FIMS primitive data types, so please correct me if I am misunderstanding anything. I would appreciate any guidance on which of these approaches best fits the intended design, or whether there is another approach you would prefer.

Thanks!

@nathanvaughan-NOAA

Contributor

Hey @szu-yun-ko, it looks like you are making good progress digging into the architecture. Changing proportion_female to a ParameterVector to match the other inputs seems like the best approach. I would suggest setting it up like other parameters that default to a scalar value but can be extended to be time-varying. This shouldn't be any more difficult than using year instead of 1 when referencing the value, because FIMS vectors include a get_force_scalar function that recycles scalar values. If you set a default value of 0.5, users can omit the value if they choose, and all the existing examples and tests will behave the same. If you hit any snags, look at the setup for log_f_multiplier, which is filled with default values; I'm not sure whether we have any other parameters with internal default values.

@szu-yun-ko szu-yun-ko changed the title Expose Population proportion_female as an R scalar with default 0.5 Expose Population proportion_female with default 0.5 Jun 3, 2026
@codecov

codecov Bot commented Jun 3, 2026


Codecov Report

❌ Patch coverage is 67.08861% with 26 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.33%. Comparing base (4a2d5f3) to head (8560af2).

Files with missing lines                                 Patch %   Lines
...de/interface/rcpp/rcpp_objects/rcpp_population.hpp    33.33%    24 Missing ⚠️
inst/include/models/functors/catch_at_age.hpp            83.33%    2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1494      +/-   ##
==========================================
- Coverage   87.53%   87.33%   -0.20%     
==========================================
  Files          97       97              
  Lines        8749     8817      +68     
  Branches      523      520       -3     
==========================================
+ Hits         7658     7700      +42     
- Misses       1054     1079      +25     
- Partials       37       38       +1     


@szu-yun-ko szu-yun-ko marked this pull request as ready for review June 5, 2026 08:13
@szu-yun-ko

Contributor Author

@kellijohnson-NOAA @awilnoaa I think this is ready for review now. At one point I ran into an issue that seemed to be the root cause of most of the failing CI checks, which I want to briefly document in this comment:

On main, Prepare() forced proportion_female[age] = 0.5 every evaluation, which masked missing initialization. Once we removed that loop so user values are respected, manual builds and integration tests that call methods::new(Population) without a parameters tibble were left with Parameter()’s default initial_value_m of 0, which propagated into spawning biomass calculations as NaNs. This was fixed by setting a default in the PopulationInterface constructor:

this->proportion_female[0].initial_value_m = static_cast<double>(0.5);
this->proportion_female[0].estimation_type_m.set("constant");

That gives integration tests and manual builds a safe fallback (0.5) without reintroducing the Prepare() overwrite. The standard initialize_fims() path still overwrites this via set_param_vector() when the user edits the parameters tibble. After the fix, most CI checks passed.

Please let me know if there's anything I have missed, and thanks for the review!

Comment thread inst/include/interface/rcpp/rcpp_objects/rcpp_population.hpp
population->proportion_female.resize(this->proportion_female.size());
}

for (size_t i = 0; i < this->proportion_female.size(); i++) {

Contributor

Correct me if I am wrong, but should proportion_female be bounded to [0, 1] here or earlier in the wrapper? I think with this setup users can pass values outside of [0, 1]?

Contributor

Hi @nathanvaughan-NOAA, I was reviewing this and wanted to get your thoughts. If proportion_female can support fixed and random effects, should it also be bounded to [0,1] throughout estimation, in addition to checking the initial value?

Contributor

I think we want to error out if a user assigns it to a fixed effect or a random effect because we have no data to estimate the value right now.

Contributor

Hey @awilnoaa, the [0,1] bounding should be accomplished by using a logit transform of the input value, the same way we use a log transform for things like F_mort to constrain them to be positive. @kellijohnson I feel like it's probably more future-proof to leave the potential for estimability than to set a rigid error. There are plenty of things, like M or the length-at-age matrix, that aren't practically estimable but are still technically allowed to be estimated.
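The logit idea in this comment can be sketched as a pair of functions. This is a generic illustration of the transform, not code from FIMS; the names `logit` and `inv_logit` are chosen here for the example.

```cpp
#include <cassert>
#include <cmath>

// logit maps a proportion in (0, 1) onto the whole real line, so an
// optimizer can work on an unconstrained value.
inline double logit(double p) {
  assert(p > 0.0 && p < 1.0);  // undefined at and outside the bounds
  return std::log(p / (1.0 - p));
}

// The inverse logit maps any real number back into (0, 1), so the value
// used by the model stays inside the bounds.
inline double inv_logit(double x) { return 1.0 / (1.0 + std::exp(-x)); }
```

Storing the parameter on the logit scale and applying `inv_logit` before use is the same pattern as storing a log value and exponentiating it to keep a quantity positive.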

Contributor Author

I've replaced the hard errors with warnings for proportion_female's bounding range.

@awilnoaa awilnoaa left a comment

Contributor

Everything looks good; I left two comments on rcpp_population.hpp. I also wanted to ask if proportion_female should be added to rcpp_models.hpp so it shows up in standard output for users to inspect after fitting?

@szu-yun-ko szu-yun-ko force-pushed the feat/prop-female branch 2 times, most recently from c306f71 to a7c95ce Compare June 9, 2026 03:52

@kellijohnson-NOAA kellijohnson-NOAA left a comment

Contributor

I added some more comments. Additionally, I think we need a test in the testthat tests demonstrating that reducing the proportion female decreases spawning biomass compared to using a higher value or the default.
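The property this test would check can be illustrated outside the package. The sketch below assumes spawning biomass is a sum over ages of numbers, proportion female, proportion mature, and weight; that formula, the function name `spawning_biomass`, and the scalar proportion are assumptions made for the example rather than the FIMS implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-year spawning biomass:
// sum over ages of numbers * proportion female * proportion mature * weight.
inline double spawning_biomass(const std::vector<double>& numbers_at_age,
                               const std::vector<double>& maturity_at_age,
                               const std::vector<double>& weight_at_age,
                               double proportion_female) {
  assert(numbers_at_age.size() == maturity_at_age.size());
  assert(numbers_at_age.size() == weight_at_age.size());
  double total = 0.0;
  for (std::size_t age = 0; age < numbers_at_age.size(); ++age) {
    total += numbers_at_age[age] * proportion_female * maturity_at_age[age] *
             weight_at_age[age];
  }
  return total;
}
```

Because the proportion enters as a multiplier, lowering it from 0.5 to 0.35 with all other inputs held fixed should lower the result, which is the comparison the requested testthat test would make on the fitted model's report.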

*/
ParameterVector log_init_naa;
/**
* @brief Proportion female in the population.

Contributor

More information here would be great.

wire it through default parameters and initialize_module, and broadcast
to the per-age C++ vector at TMB build time.

Switch from a plain double to ParameterVector on PopulationInterface so
values stay in sync with live_objects on copy. Wire through set_param_vector,
fill a scalar default when unset, and use get_force_scalar(age) in spawning
biomass calculations to support scalar or per-age inputs.

Accept the expected default parameter table after adding the
proportion_female Population row with default 0.5.

Broadcast a constant length-1 R value to n_ages at TMB build time,
default empty vectors to n_ages in Initialize(), and initialize
proportion_female in gtest fixtures now that Prepare() no longer resets it.

Follow reviewer guidance: keep length-1 or n_ages on the model side,
default missing values in add_to_fims like log_f_multiplier, and use
get_force_scalar(age) instead of broadcasting constants to n_ages.

Manual methods::new(Population) builds left proportion_female at
Parameter()'s initial_value_m of 0. Integration tests use that path
and no longer rely on main's Prepare() loop to force 0.5 each eval.
@kellijohnson-NOAA

Contributor

There are two failed tests in the testthat tests, see the output here. Let us know if you need help fixing them.

@szu-yun-ko

Contributor Author

Thanks for the review! I should be able to address the comments and implement fixes to pass the tests before next Monday.

@szu-yun-ko

Contributor Author

I think this is ready for review again! I've made some changes to address the comments, and the testthat tests now pass. I'm not sure why build-metrics, run-slow-tests, and the macOS x86_64 test failed, but from the error logs I suspect the failures are not caused by changes in this PR. I've also added a testthat test to check that reducing the proportion female decreases spawning biomass compared to using a higher value or the default.

4 participants