Clarification on Bounding Box Normalization in SFT/RL sets #13

@MLLMrgb

Thank you for your impressive work on TreeVGR.
I am currently exploring the provided SFT and RL datasets and have a quick technical question regarding the bounding box (bbox) format:

  1. Are the bbox values (is the format [x1, x2, y1, y2]?) normalized to a [0, 1000] or [0, 1024] range, or are they absolute pixel coordinates in the original images?
  2. Alternatively, could you provide a brief code snippet, or point to the script in the repository that shows how these boxes are loaded and processed during training?
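
To make the question concrete, here is a minimal sketch of the two interpretations I am trying to tell apart. It is not taken from the repository: the helper names `denormalize_bbox` and `exceeds_image` are hypothetical, the corner order [x1, y1, x2, y2] is an assumption (the actual order is part of what I am asking), and the default scale of 1000 is only one of the candidate ranges.

```python
from typing import List, Sequence


def denormalize_bbox(
    bbox: Sequence[float], width: int, height: int, scale: float = 1000.0
) -> List[float]:
    """Map a box from a [0, scale] grid to pixel coordinates.

    Assumes the order [x1, y1, x2, y2]; this order is a guess, not
    something confirmed by the dataset.
    """
    x1, y1, x2, y2 = bbox
    return [
        x1 / scale * width,
        y1 / scale * height,
        x2 / scale * width,
        y2 / scale * height,
    ]


def exceeds_image(bbox: Sequence[float], width: int, height: int) -> bool:
    """Return True if any corner lies outside the image.

    A box that exceeds the image size cannot be in absolute pixels,
    which hints that the values are normalized instead.
    """
    x1, y1, x2, y2 = bbox
    return max(x1, x2) > width or max(y1, y2) > height


# Hypothetical sample: a 640x480 image with a box stored on a 0-1000 grid.
box = [500, 250, 1000, 1000]
print(exceeds_image(box, 640, 480))
print(denormalize_bbox(box, 640, 480))
```

The out-of-bounds check only rules out the pixel interpretation; a box that fits inside the image is consistent with either convention, so confirmation from the authors would still be needed.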
