
Support reading sharded files #43

@klamike


PGLearn is hosted on Hugging Face (HF), which has a 50GB per-file limit. For the medium/large cases, some files exceed this limit, so splits are pushed instead. For example, SOCOPF/dual.h5 is replaced by a folder SOCOPF/dual containing files xaa, xab which, when concatenated with `cat`, give the original SOCOPF/dual.h5. ML4OPF should merge these automatically upon first read (like how the HF dataset loading script does here).
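A minimal sketch of the requested behavior, assuming the shards sort lexicographically in the order they were split (as `xaa`, `xab` do) and that the shard folder is named after the file minus its `.h5` suffix. The names `merge_shards` and `resolve_h5` are hypothetical and are not part of ML4OPF's existing API:

```python
import shutil
from pathlib import Path


def merge_shards(shard_dir: Path, out_path: Path) -> Path:
    """Concatenate the files in `shard_dir`, sorted by name, into `out_path`."""
    shards = sorted(p for p in shard_dir.iterdir() if p.is_file())
    if not shards:
        raise FileNotFoundError(f"no shard files found in {shard_dir}")

    # Write to a temporary file first so an interrupted merge
    # never leaves a truncated file at `out_path`.
    tmp_path = out_path.with_name(out_path.name + ".part")
    with open(tmp_path, "wb") as out:
        for shard in shards:
            with open(shard, "rb") as src:
                shutil.copyfileobj(src, out)  # streams in chunks, not all in memory
    tmp_path.replace(out_path)
    return out_path


def resolve_h5(path) -> Path:
    """Return `path` if it exists; otherwise merge its shard folder on first read."""
    path = Path(path)
    if path.is_file():
        return path
    shard_dir = path.with_suffix("")  # e.g. SOCOPF/dual.h5 -> SOCOPF/dual
    if shard_dir.is_dir():
        return merge_shards(shard_dir, path)
    raise FileNotFoundError(f"neither {path} nor shard folder {shard_dir} exists")
```

A reader could then call `resolve_h5("SOCOPF/dual.h5")` before opening the file; the merge runs only the first time, and later calls return the already-merged file. Whether to delete the shards after merging (to avoid doubling disk usage) is left open here.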
