respecify length bins #31

@sgaichas

We started with equally spaced length bins spanning the range of observed lengths across fishery and survey data for each species. This made inefficient use of the data: the smallest and largest bins were often missing data.

Options discussed in January 2023 include:

  • Start bin 1 at the minimum observed length rather than 0
  • Combine bins 1-2? Finer structure thereafter? Bigger last bin? Plus group?

A new bin definition algorithm implemented in hydradata first calculates length quantiles for each species from all input lengths aggregated over time. The current implementation takes the smaller of the survey and fishery 10th percentiles as the minimum size, and the larger of the survey and fishery 90th percentiles as the maximum size, for defining bin widths. Equal-width bins are calculated within this range, and then the first bin is extended down to 0 and the last bin is extended up to the maximum observed length.
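
To make the steps above concrete, here is a minimal Python sketch of the bin-edge calculation for a single species. It is not the hydradata code: the function names (`percentile`, `length_bin_edges`), the linear-interpolation percentile method, and the choice to split the 10th-to-90th percentile range into exactly `n_bins` equal widths before extending the outer edges are all assumptions made for illustration.

```python
from typing import Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """Percentile by linear interpolation, q in [0, 1] (assumed method)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def length_bin_edges(
    survey: Sequence[float],
    fishery: Sequence[float],
    n_bins: int = 5,
) -> list[float]:
    """Hypothetical helper: bin edges for one species, lengths pooled over time."""
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    # Smaller of the two 10th percentiles, larger of the two 90th percentiles.
    lower = min(percentile(survey, 0.10), percentile(fishery, 0.10))
    upper = max(percentile(survey, 0.90), percentile(fishery, 0.90))
    # Equal bin widths within [lower, upper].
    width = (upper - lower) / n_bins
    interior = [lower + i * width for i in range(1, n_bins)]
    # Extend the first bin down to 0 and the last bin up to the max observed length.
    max_observed = float(max(max(survey), max(fishery)))
    return [0.0] + interior + [max_observed]


# Made-up lengths, for illustration only.
edges = length_bin_edges(survey=list(range(1, 12)), fishery=list(range(11, 32)))
```

Under this reading, the interior bins all have the same width, while the first and last bins are wider because they absorb the tails below the 10th percentile and above the 90th percentile.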

A visualization of bin definitions (black vertical lines) for each species and aggregate dataset is below, based on the current (January 2023) mskeyrun Georges Bank dataset and 5 length bins:

image

image

Thoughts?
