This repository was archived by the owner on Jan 10, 2023. It is now read-only.

id generation for data/metadata in CSV files #9

@nkrishnaswami

We may want to indicate how to generate IDs for the triples corresponding to the rows in CSV files.

This will facilitate a well-defined mapping from DSPL 2 datasets to triples, and may make it feasible to use dimension values and footnotes defined in CSV files across datasets.

Tentative proposal

Attempt to generate easy-to-keep-unique IDs, and make no provisions for ID collisions.

codeList

For each CSV row,

  • Start with the containing dimension's ID.
  • If the ID has no fragment, set the fragment to the URL-encoded name of the dimension.
  • Append an = and the URL-encoded codeValue to the fragment.

For example, if a row's codeValue is us and its containing Dimension has an @id of #country, the row's triples should be generated as if from equivalent JSON-LD with "@id": "#country=us".
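
The codeList steps above could be sketched as follows. This is a minimal illustration rather than part of the proposal: the function name code_list_row_id is hypothetical, and it assumes "URL-encoded" means percent-encoding with no characters left unescaped.

```python
from urllib.parse import quote


def code_list_row_id(dimension_id: str, dimension_name: str, code_value: str) -> str:
    """Build the ID for one codeList CSV row (hypothetical helper)."""
    # Split the dimension's @id into the part before '#' and the fragment.
    base, _, fragment = dimension_id.partition("#")
    if not fragment:
        # No fragment: fall back to the URL-encoded dimension name.
        fragment = quote(dimension_name, safe="")
    # Assumption: percent-encode everything, including '/' and '='.
    return f"{base}#{fragment}={quote(code_value, safe='')}"


print(code_list_row_id("#country", "country", "us"))  # -> #country=us
```

Encoding = and / inside the codeValue keeps them from being confused with the separators the scheme itself uses.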


footnote

For each CSV row,

  • Start with the containing StatisticalDataset's @id.
  • If there is a fragment, append a / to it.
  • Append footnote= and the URL-encoded codeValue to the fragment.

For example, if the dataset's @id is the empty string, a footnote with a codeValue of p would yield an ID of #footnote=p. Similarly, if the dataset's @id is #my_dataset, the footnote would have an @id of #my_dataset/footnote=p.
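
A minimal sketch of the footnote rule, under the same assumptions (footnote_row_id is a hypothetical name, and "URL-encoded" is taken to mean full percent-encoding):

```python
from urllib.parse import quote


def footnote_row_id(dataset_id: str, code_value: str) -> str:
    """Build the ID for one footnote CSV row (hypothetical helper)."""
    # Split the dataset's @id into the part before '#' and the fragment.
    base, _, fragment = dataset_id.partition("#")
    if fragment:
        # An existing fragment is extended with a '/' separator.
        fragment += "/"
    return f"{base}#{fragment}footnote={quote(code_value, safe='')}"


print(footnote_row_id("", "p"))             # -> #footnote=p
print(footnote_row_id("#my_dataset", "p"))  # -> #my_dataset/footnote=p
```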


observation

For each CSV row,

  • Start with the slice's @id.
  • If there is a fragment, append a / to it.
  • Sort the dimension values by dimension name.
  • For each dimension value, append the URL-encoded name, = and the URL-encoded codeValue to the fragment, separating the entries with /.
  • Sort the measure values by measure name.
  • For each measure value, append the URL-encoded name to the fragment, separating entries with /.

For example, an observation in a slice with an @id of #europe_unemployment_slice with dimensions

  • gender of m,
  • country of uk, and
  • month of 2010-10

and measures

  • unemployment_rate and
  • unemployment

would have an @id of #europe_unemployment_slice/country=uk/gender=m/month=2010-10/unemployment/unemployment_rate
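
The observation rule could be sketched like this. The function name observation_row_id and its argument shapes (a dict of dimension name to codeValue, plus an iterable of measure names) are assumptions made for illustration, as is the reading of "URL-encoded" as full percent-encoding.

```python
from urllib.parse import quote


def observation_row_id(slice_id: str, dimension_values: dict, measure_names) -> str:
    """Build the ID for one observation CSV row (hypothetical helper)."""
    base, _, fragment = slice_id.partition("#")
    # Dimension entries, sorted by dimension name: name=codeValue.
    parts = [
        f"{quote(name, safe='')}={quote(value, safe='')}"
        for name, value in sorted(dimension_values.items())
    ]
    # Measure entries, sorted by measure name: name only.
    parts += [quote(name, safe="") for name in sorted(measure_names)]
    # An existing fragment is extended with a '/' separator.
    prefix = f"{fragment}/" if fragment else ""
    return f"{base}#{prefix}{'/'.join(parts)}"


print(observation_row_id(
    "#europe_unemployment_slice",
    {"gender": "m", "country": "uk", "month": "2010-10"},
    ["unemployment_rate", "unemployment"],
))
# -> #europe_unemployment_slice/country=uk/gender=m/month=2010-10/unemployment/unemployment_rate
```

Sorting both groups by name makes the ID independent of the column order in the CSV file.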
