filter-7z schema related problems. #32

@ramSeraph

From @kushkamal84-eng at ramSeraph/indian_cadastrals#8 (reply in thread)

Just to clarify the workflow I followed:

I first ran infer-schema on the full archive:
Input:
D:\MP_cad\MPBhulekh_MP_Survey_Cadastrals.geojsonl.7z.001

Command:

uvx iomaps cli infer-schema ^
-i D:\MP_cad\MPBhulekh_MP_Survey_Cadastrals.geojsonl.7z.001 ^
-o D:\MP_cad\mp_full.schema.json ^
-g Polygon

This command completed successfully after a full scan of the archive
(~26 hours) and generated mp_full.schema.json.

I then used the same schema file (mp_full.schema.json) for filtering
the same input archive:
uvx iomaps cli filter-7z ^
-i D:\MP_cad\MPBhulekh_MP_Survey_Cadastrals.geojsonl.7z.001 ^
-o C:\MP_work\MP_full.gpkg ^
-b "73.3,18.8,84.5,28.9" ^
-s D:\MP_cad\mp_full.schema.json ^
-g Polygon ^
--no-clip

However, this fails with:

ValueError: Record does not match collection schema

In other words, the schema was inferred from a full scan of the same
input file, yet while filter-7z is writing features, some records still
do not match that schema (extra or missing properties).
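
To pin down which records are responsible, a small standalone check along the following lines might help. This is only a sketch: it assumes the schema file holds a Fiona-style "properties" mapping (I have not confirmed the layout iomaps actually writes), and the names diff_against_schema and find_mismatches, as well as the sample records, are hypothetical.

```python
import json


def diff_against_schema(feature, schema):
    """Return (extra, missing) property names of one feature relative to the schema."""
    expected = set(schema["properties"])
    actual = set(feature.get("properties") or {})
    return sorted(actual - expected), sorted(expected - actual)


def find_mismatches(lines, schema, limit=10):
    """Scan GeoJSONL lines and collect up to `limit` mismatching records."""
    found = []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        extra, missing = diff_against_schema(json.loads(line), schema)
        if extra or missing:
            found.append((lineno, extra, missing))
            if len(found) >= limit:
                break
    return found


# Hypothetical schema and records, for illustration only.
# The schema layout (a "properties" mapping) is an assumption.
schema = {"geometry": "Polygon", "properties": {"survey_no": "str", "village": "str"}}
sample = [
    '{"type": "Feature", "geometry": null, "properties": {"survey_no": "12", "village": "A"}}',
    '{"type": "Feature", "geometry": null, "properties": {"survey_no": "13"}}',
    '{"type": "Feature", "geometry": null, "properties": {"survey_no": "14", "village": "B", "khasra": "7"}}',
]

mismatches = find_mismatches(sample, schema)
for lineno, extra, missing in mismatches:
    print(f"line {lineno}: extra={extra} missing={missing}")
```

Run against the decompressed GeoJSONL stream instead of the sample list, the first few reported lines should show whether the problem is extra keys, missing keys, or both.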

This suggests that even full-archive schema inference is not sufficient
to guarantee that every record matches the schema during streaming writes.
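
One possible workaround, sketched here under the same assumption about the schema layout, would be to force each record onto the schema's property set just before it is written. The function conform_to_schema is hypothetical and not part of iomaps; I do not know whether filter-7z already does something similar internally.

```python
def conform_to_schema(feature, schema):
    """Return a copy of the feature whose properties have exactly the schema's keys.

    Missing properties are filled with None; properties that the schema
    does not list are dropped. The input feature is left unmodified.
    """
    props = feature.get("properties") or {}
    fixed = {name: props.get(name) for name in schema["properties"]}
    return {**feature, "properties": fixed}


# Hypothetical schema and record, for illustration only.
schema = {"geometry": "Polygon", "properties": {"survey_no": "str", "village": "str"}}
feature = {
    "type": "Feature",
    "geometry": None,
    "properties": {"survey_no": "14", "khasra": "7"},
}

conformed = conform_to_schema(feature, schema)
print(conformed["properties"])
```

Note that dropping unlisted properties silently discards data, so this would only be appropriate if the inferred schema is meant to be authoritative; widening the schema to cover every key seen would be the alternative.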
