We may want to specify how to generate IDs for the triples corresponding to the rows in CSV files.
This will facilitate a well-defined mapping from DSPL 2 datasets to triples, and may make it feasible to reuse dimension values and footnotes defined in CSV files across datasets.
Tentative proposal
Attempt to generate easy-to-keep-unique IDs, and make no provisions for ID collisions.
codeList
For each CSV row,
- Start with the containing dimension's ID.
- If there is no fragment, set the fragment to the dimension's name, URL encoded.
- Append an = and the URL-encoded codeValue to the fragment.
For example, if a row's codeValue is us and its containing Dimension has @id of #country, the row's triples should be generated as if from equivalent JSON-LD with "@id": "#country=us".
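The codeList steps above can be sketched in Python as follows. This is a minimal illustration, not part of the proposal: the function name code_list_row_id and its parameters are hypothetical, and it assumes that "URL encoded" means percent-encoding every reserved character, including /.

```python
from urllib.parse import quote, urldefrag


def code_list_row_id(dimension_id: str, dimension_name: str, code_value: str) -> str:
    """Build an ID for a codeList row (hypothetical helper, not from the spec)."""
    # Split the dimension's ID into the part before '#' and the fragment.
    base, fragment = urldefrag(dimension_id)
    # If there is no fragment, fall back to the URL-encoded dimension name.
    if not fragment:
        fragment = quote(dimension_name, safe="")
    # Append '=' and the URL-encoded codeValue to the fragment.
    encoded_value = quote(code_value, safe="")
    return f"{base}#{fragment}={encoded_value}"


print(code_list_row_id("#country", "country", "us"))
```

With a dimension @id of #country and a codeValue of us, this yields #country=us, matching the example above.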
footnote
For each CSV row,
- Start with the containing StatisticalDataset's @id.
- If there is a fragment, append a /
- Append footnote= and the URL-encoded codeValue to the fragment.
For example, if the dataset's @id is the empty string, a footnote with codeValue of p would yield an ID of #footnote=p. Similarly, if the dataset @id is #my_dataset, the footnote would have @id of #my_dataset/footnote=p.
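A similar sketch for footnotes, under the same assumptions: footnote_id is a hypothetical name, and the URL encoding is assumed to percent-encode all reserved characters.

```python
from urllib.parse import quote, urldefrag


def footnote_id(dataset_id: str, code_value: str) -> str:
    """Build an ID for a footnote row (hypothetical helper, not from the spec)."""
    # Split the dataset's @id into the part before '#' and the fragment.
    base, fragment = urldefrag(dataset_id)
    # If there is already a fragment, separate it from the new entry with '/'.
    if fragment:
        fragment += "/"
    # Append 'footnote=' and the URL-encoded codeValue to the fragment.
    encoded_value = quote(code_value, safe="")
    return f"{base}#{fragment}footnote={encoded_value}"


print(footnote_id("", "p"))
print(footnote_id("#my_dataset", "p"))
```

The two calls correspond to the two cases in the example above: an empty dataset @id and a dataset @id of #my_dataset.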
observation
For each CSV row,
- Start with the slice's @id.
- If there is a fragment, append a / to it.
- Sort the dimension values by dimension name.
- For each dimension value, append the URL-encoded name, = and the URL-encoded codeValue to the fragment, separating the entries with /.
- Sort the measure values by measure name.
- For each measure value, append the URL-encoded name to the fragment, separating entries with /.
For example, an observation in a slice with an @id of #europe_unemployment_slice, with dimensions gender of m, country of uk, and month of 2010-10, and measures unemployment_rate and unemployment, would have an @id of #europe_unemployment_slice/country=uk/gender=m/month=2010-10/unemployment/unemployment_rate.
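The observation steps can be sketched the same way. The name observation_id and its parameters are hypothetical. The sketch also assumes that sorting uses Python's default string ordering on the unencoded names, since the proposal does not name a collation.

```python
from urllib.parse import quote, urldefrag


def observation_id(slice_id: str, dimension_values: dict, measure_names: list) -> str:
    """Build an ID for an observation row (hypothetical helper, not from the spec)."""

    def enc(text: str) -> str:
        # Percent-encode everything reserved, including '/' and '='.
        return quote(text, safe="")

    # Split the slice's @id into the part before '#' and the fragment.
    base, fragment = urldefrag(slice_id)
    # Dimension entries, sorted by dimension name: name=codeValue.
    parts = [f"{enc(name)}={enc(value)}"
             for name, value in sorted(dimension_values.items())]
    # Measure entries, sorted by measure name: name only.
    parts += [enc(name) for name in sorted(measure_names)]
    # If the slice already has a fragment, separate it from the entries with '/'.
    prefix = fragment + "/" if fragment else ""
    joined = "/".join(parts)
    return f"{base}#{prefix}{joined}"


print(observation_id(
    "#europe_unemployment_slice",
    {"gender": "m", "country": "uk", "month": "2010-10"},
    ["unemployment_rate", "unemployment"],
))
```

Because both the dimensions and the measures are sorted before being appended, the resulting ID does not depend on the column order in the CSV file.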