Clarification on Bounding Box Normalization in SFT/RL sets #13

Thank you for your impressive work on TreeVGR.
I am currently exploring the provided SFT and RL datasets and have a quick technical question about the bounding box (bbox) format:

1. Are the bbox values (formatted as [x1, x2, y1, y2]?) normalized to a [0, 1000] or [0, 1024] range, or are they absolute pixel coordinates in the original images?
2. Alternatively, could you share a brief code snippet, or point to the script in the repository, that shows how these boxes are loaded and processed during training?
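
To make the ambiguity concrete, here is a minimal sketch of how the same four numbers map to different pixel boxes under each candidate convention. The helper names `to_pixels` and `exceeds_image` are hypothetical and do not come from the repository; the `scale` and `order` options are assumptions that simply enumerate the possibilities asked about above.

```python
from typing import Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]


def to_pixels(
    bbox: Sequence[float],
    width: int,
    height: int,
    scale: Optional[float] = None,
    order: str = "xyxy",
) -> Box:
    """Return (x1, y1, x2, y2) in absolute pixels.

    scale=None  -> values are assumed to be absolute pixels already.
    scale=1000  -> values are assumed to be normalized to [0, 1000].
    order       -> "xyxy" for [x1, y1, x2, y2], "xxyy" for [x1, x2, y1, y2].
    These options are assumptions, not the datasets' documented format.
    """
    if order == "xyxy":
        x1, y1, x2, y2 = bbox
    elif order == "xxyy":
        x1, x2, y1, y2 = bbox
    else:
        raise ValueError(f"unknown order: {order!r}")

    if scale is None:
        return (float(x1), float(y1), float(x2), float(y2))

    # Rescale each axis from [0, scale] to the image size.
    return (
        x1 / scale * width,
        y1 / scale * height,
        x2 / scale * width,
        y2 / scale * height,
    )


def exceeds_image(bbox_px: Box, width: int, height: int) -> bool:
    """True if the box extends past the image, which hints at a wrong convention."""
    _, _, x2, y2 = bbox_px
    return x2 > width or y2 > height


if __name__ == "__main__":
    raw = [500, 250, 1000, 750]  # made-up values, not taken from the datasets
    for scale in (None, 1000, 1024):
        box = to_pixels(raw, width=640, height=480, scale=scale)
        print(scale, box, exceeds_image(box, 640, 480))
```

A check like `exceeds_image` can rule out the absolute-pixel reading when boxes fall outside the image, but it cannot distinguish [0, 1000] from [0, 1024], which is why a confirmation of the intended convention would help.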
