Name cleanup #148
Merged
These four classes were added to the D4D schema after the original
semantic exchange layer was authored, leaving them without RO-Crate
mappings. This commit closes that gap.
Semantic SSSOM (src/data_sheets_schema/alignment/d4d_rocrate_sssom_mapping.tsv):
+12 rows (95 → 107)
- DatasetCollection → schema:Dataset (exactMatch, RO-Crate root)
- DatasetCollection → dcat:Catalog (closeMatch, semantic-catalog view)
- File → schema:MediaObject (exactMatch)
- File → schema:DigitalDocument (closeMatch)
- FileCollection → schema:Dataset (exactMatch, nested in hasPart)
- FileCollection → dcat:Distribution (closeMatch)
- 6 key-slot rows: DatasetCollection.resources/FileCollection.resources →
schema:hasPart, File.file_type → d4d:fileType, FileCollection.{collection_type,
file_count, total_bytes} → d4d:collectionType / d4d:fileCount / dcat:byteSize
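As a rough illustration of the six class-level rows listed above, the sketch below writes them as a three-column TSV. The `d4d:` subject prefix on class names, the `skos:` predicate CURIEs, and the reduced column set are assumptions for illustration; the canonical semantic file has more columns that are not reproduced here.

```python
import csv
import io

# Class-level mappings as listed in this commit message.
# Assumption: subjects use a "d4d:" prefix and predicates are skos: CURIEs.
CLASS_ROWS = [
    ("d4d:DatasetCollection", "skos:exactMatch", "schema:Dataset"),
    ("d4d:DatasetCollection", "skos:closeMatch", "dcat:Catalog"),
    ("d4d:File", "skos:exactMatch", "schema:MediaObject"),
    ("d4d:File", "skos:closeMatch", "schema:DigitalDocument"),
    ("d4d:FileCollection", "skos:exactMatch", "schema:Dataset"),
    ("d4d:FileCollection", "skos:closeMatch", "dcat:Distribution"),
]


def rows_to_tsv(rows):
    """Serialize (subject, predicate, object) tuples as tab-separated text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["subject_id", "predicate_id", "object_id"])
    writer.writerows(rows)
    return buf.getvalue()


tsv_text = rows_to_tsv(CLASS_ROWS)
print(tsv_text)
```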
Structural SSSOM (data/mappings/d4d_rocrate_structural_mapping.sssom.tsv):
+6 rows (149 → 155) — slot-level rows mirroring the semantic-file slots
SKOS alignment (src/data_sheets_schema/alignment/d4d_rocrate_skos_alignment.ttl):
- Added dcat: prefix declaration
- Added 6 class-level + 6 slot-level skos triples mirroring the SSSOM rows
Per the user's note that DatasetCollection may be the RO-Crate root
(@type=["Dataset", "https://w3id.org/EVI#ROCrate"], @id="./"),
DatasetCollection is given a dual mapping: exactMatch → schema:Dataset
(root semantics) and closeMatch → dcat:Catalog (semantic-catalog view).
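To make the root-entity shape concrete, here is a minimal sketch of what a DatasetCollection serialized as the RO-Crate root might look like. Only `@id` and `@type` come from the note above; the `name` value, the `hasPart` target, and the `is_rocrate_root` helper are hypothetical.

```python
import json

# Root entity with the @id / @type pair quoted in the commit message.
root_entity = {
    "@id": "./",
    "@type": ["Dataset", "https://w3id.org/EVI#ROCrate"],
    "name": "Example dataset collection",  # hypothetical value
    "hasPart": [{"@id": "files/"}],        # hypothetical nested FileCollection
}


def is_rocrate_root(entity):
    """Return True if the entity has @id './' and includes the Dataset type."""
    types = entity.get("@type", [])
    if isinstance(types, str):
        types = [types]
    return entity.get("@id") == "./" and "Dataset" in types


print(json.dumps(root_entity, indent=2))
```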
Out of scope for this PR (existing TODOs remain):
- src/fairscape_integration/d4d_to_fairscape.py:292-295 — converter
code does not yet traverse FileCollection.resources to emit RO-Crate
File entities. The mapping layer is now ready; converter update is
a separate follow-up.
- The generated comprehensive/uri SSSOM variants weren't regenerated;
the canonical files (semantic + structural) are the source of truth.
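For the first out-of-scope item, a follow-up converter change could look roughly like the sketch below. This is not the code in d4d_to_fairscape.py; the input dictionary shape (`id`, `resources`, `file_type`) and the function name are assumptions made purely for illustration.

```python
def file_collection_to_entities(collection):
    """Return (dataset_entity, file_entities) for one FileCollection dict.

    Assumed input shape (hypothetical):
    {"id": ..., "resources": [{"id": ..., "file_type": ...}, ...]}
    """
    files = []
    for res in collection.get("resources", []):
        # "File" is the RO-Crate 1.1 context alias for schema:MediaObject.
        entity = {"@id": res["id"], "@type": "File"}
        if "file_type" in res:
            entity["d4d:fileType"] = res["file_type"]
        files.append(entity)
    dataset = {
        "@id": collection["id"],
        "@type": "Dataset",
        # Each contained file is referenced from the nested Dataset via hasPart.
        "hasPart": [{"@id": f["@id"]} for f in files],
    }
    return dataset, files
```

The nested Dataset plus `hasPart` references mirrors the FileCollection → schema:Dataset and resources → schema:hasPart rows added in this commit.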
Validation:
- SSSOMIntegration parses both files (semantic via custom reader,
structural via sssom-py per the existing column-naming setup)
- All 190 tests in tests/test_alignment + tests/test_fairscape_integration pass
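Independently of SSSOMIntegration, a quick sanity check on an SSSOM TSV can be sketched with the standard library alone. The sample text, the three-column layout, and both helper names are assumptions; the `# Total mappings:` comment line is modeled on the header format mentioned elsewhere in this PR.

```python
import csv
import io

# Hypothetical miniature SSSOM file with '#' comment lines before the header.
SAMPLE = """\
# Total mappings: 2
# Date: 2026-04-26
subject_id\tpredicate_id\tobject_id
d4d:File\tskos:exactMatch\tschema:MediaObject
d4d:File\tskos:closeMatch\tschema:DigitalDocument
"""


def read_sssom_rows(text):
    """Parse tab-separated rows, skipping blank lines and '#' comments."""
    data_lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    return list(csv.DictReader(data_lines, delimiter="\t"))


def declared_total(text):
    """Return the integer from a '# Total mappings: N' line, or None."""
    for ln in text.splitlines():
        if ln.startswith("# Total mappings:"):
            return int(ln.split(":", 1)[1])
    return None


rows = read_sssom_rows(SAMPLE)
print(len(rows), declared_total(SAMPLE))
```

Comparing the parsed row count with the declared total is a cheap way to catch a stale header after rows are added.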
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A reusable Claude Code slash command that captures the workflow used in this PR: adding D4D ↔ RO-Crate / FAIRSCAPE mappings for new schema classes.
The skill:
- Describes the 19-column semantic SSSOM and 17-column structural SSSOM layouts and points at the canonical files
- Provides a decision rubric for choosing primary/secondary RO-Crate targets based on class_uri / exact_mappings / tree_root annotations
- Includes row templates and a Python helper-script skeleton
- Documents standard RO-Crate target conventions (root Dataset, schema:MediaObject, dcat:Catalog, schema:hasPart, etc.)
- Specifies the mandatory validation step via SSSOMIntegration + pytest
- Codifies branch / commit / PR conventions
- Calls out known follow-ups to keep out of scope (converter TODOs, generator regen, schema YAML touch-ups)
Cross-references PR #147 as the canonical worked example.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Generated from the D4D ↔ RO-Crate semantic SSSOM by parsing rocrate_json_path patterns to extract entity types and their properties.
Shows:
- Dataset (root) with properties grouped by namespace (schema.org, DCAT, FAIRSCAPE EVI, Croissant RAI, D4D-specific)
- Sub-entities: MediaObject, Person, Organization, Grant, CreativeWork, DefinedTerm
- Reference edges (author/creator/contributor → Person, funder → Grant, publisher → Organization, citation → CreativeWork, about → DefinedTerm, hasPart → MediaObject)
- ROCrate as root marker connected via dashed @type edge
Generator: src/alignment/ (helper script captured in /tmp during this PR); rendered with graphviz dot -Gdpi=180.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
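The reference edges described in this commit can be turned into Graphviz DOT text with a few lines of Python. This is a sketch under assumptions, not the helper script from the PR: the edge table is copied from the commit message, while the function name, graph name, and layout are hypothetical, and the step that parses rocrate_json_path patterns is not reproduced.

```python
# Reference edges from the commit message: property -> target entity type.
REFERENCE_EDGES = {
    "author": "Person",
    "creator": "Person",
    "contributor": "Person",
    "funder": "Grant",
    "publisher": "Organization",
    "citation": "CreativeWork",
    "about": "DefinedTerm",
    "hasPart": "MediaObject",
}


def to_dot(edges, root="Dataset"):
    """Render labeled edges from the root entity as a DOT digraph string."""
    lines = ["digraph rocrate_profile {"]
    for prop, target in edges.items():
        lines.append(f'  "{root}" -> "{target}" [label="{prop}"];')
    # The ROCrate root marker is attached through a dashed @type edge.
    lines.append(f'  "{root}" -> "ROCrate" [label="@type", style=dashed];')
    lines.append("}")
    return "\n".join(lines)


dot_text = to_dot(REFERENCE_EDGES)
print(dot_text)
```

The resulting text could be saved to a .dot file and rendered with the `dot` command mentioned above.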
Per-class side-by-side comparison of slot counts in the d4d-core semantic exchange layer (left, orange) versus mapped/standard RO-Crate properties on the corresponding target type (right, green). Right-side counts combine SSSOM-discovered properties with the schema.org / RO-Crate 1.1 baseline for sub-entity types (Person, Organization, Grant, MediaObject, Distribution).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… site coverage
- src/data_sheets_schema/alignment/ → src/data_sheets_schema/semantic_exchange/ (canonical SKOS TTL + semantic SSSOM artifacts)
- data/mappings/ → data/semantic_exchange/ (sssom-py-compatible structural mapping + analysis docs)
- src/alignment/ → src/semantic_exchange/ (generator scripts)
- tests/test_alignment/ → tests/test_semantic_exchange/
Updated all path references in Makefile, generator scripts, schema YAMLs, fairscape_integration, notes, and tests. All 190 tests pass.
Visibility improvements:
- README.md: new "D4D-Core Schema" + "Semantic Exchange Layer" sections with per-artifact path tables
- docs/home.md: top-level pointers to D4D-Core and Semantic Exchange
- docs/d4d_core.md: new hand-curated landing page for the core schema (artifacts, build/validate targets, curated example datasheets, class crosswalk, rationale)
- docs/semantic_exchange.md: new hand-curated landing page for the exchange layer (canonical artifacts, generator scripts, validation, /d4d-add-mapping workflow, namespaces, coverage stats)
- mkdocs.yml: added "D4D-Core" and "Semantic Exchange" to nav
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
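The four directory renames in this commit amount to a prefix rewrite, which can be sketched as below. The rename table is taken from the commit message; the `repoint` helper is a hypothetical name and is not part of the repository.

```python
# Old directory prefix -> new directory prefix, as listed in the commit.
RENAMES = [
    ("src/data_sheets_schema/alignment/", "src/data_sheets_schema/semantic_exchange/"),
    ("data/mappings/", "data/semantic_exchange/"),
    ("src/alignment/", "src/semantic_exchange/"),
    ("tests/test_alignment/", "tests/test_semantic_exchange/"),
]


def repoint(path):
    """Rewrite a repo-relative path from a pre-rename prefix to the new one."""
    for old, new in RENAMES:
        if path.startswith(old):
            return new + path[len(old):]
    return path  # paths outside the renamed directories are left unchanged


print(repoint("data/mappings/README.md"))
```

Matching on the full leading prefix keeps `src/alignment/` from accidentally matching inside `src/data_sheets_schema/alignment/`.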
Previously the chart only covered 8 hand-listed structural classes. Now it shows every d4d-core class, sorted by slot count, in a two-column layout with poster-friendly aspect (~1.84).
Right-side counts:
- Structural targets (Dataset/Distribution/Person/Org/Grant/etc.): full property surface (SSSOM-discovered + schema.org baseline)
- Property/wrapper classes: derived by looking up which slots have the class as range, then checking the SKOS TTL for mapped targets
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
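The derivation for property/wrapper classes (find the slots whose range is the class, then count those with a mapped target) might look like the sketch below. The slot table, the mapped-slot set, and the function name are all hypothetical stand-ins; the real chart script reads the schema and the SKOS TTL instead.

```python
# Hypothetical slot table: slot name -> declared range.
SLOT_RANGES = {
    "creators": "Person",
    "contributors": "Person",
    "funders": "Grant",
    "title": "string",
}

# Hypothetical set of slots that have a mapped target in the SKOS TTL.
MAPPED_SLOTS = {"creators", "funders"}


def mapped_count_for_class(class_name, slot_ranges, mapped_slots):
    """Count slots whose range is class_name and that have a mapped target."""
    using = [slot for slot, rng in slot_ranges.items() if rng == class_name]
    return sum(1 for slot in using if slot in mapped_slots)


print(mapped_count_for_class("Person", SLOT_RANGES, MAPPED_SLOTS))
```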
- SSSOM subject_id values for the 6 new key-slot rows now use the underscore form (d4d:Class_slot) to match the SKOS TTL subjects and what generate_sssom_mapping.py emits, so downstream lookups via SSSOMIntegration.get_mappings_by_subject() resolve correctly.
- SSSOM header refreshed: '# Total mappings: 107' (was 95) and '# Date: 2026-04-26'.
- SKOS TTL header bumped to Version 1.1 / Date 2026-04-26 and the alignment-statistics block updated to reflect the current 112 triples (69 exact / 25 close / 10 related / 7 narrow / 1 broad) and the per-namespace counts (schema.org 57, rai 29, d4d 10, evi 7, dcat 3, rdf 1).
Tests: 190 passed, 2 skipped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
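The underscore convention for slot-level subject_id values can be illustrated as follows. Both helpers are hypothetical, and the dotted `Class.slot` input form is an assumption about what the rows looked like before this fix; only the d4d:Class_slot target form comes from the commit message.

```python
def slot_subject_id(class_name, slot_name, prefix="d4d"):
    """Build a slot-level subject_id in the underscore form d4d:Class_slot."""
    return f"{prefix}:{class_name}_{slot_name}"


def normalize_subject_id(subject_id):
    """Rewrite a dotted 'prefix:Class.slot' id to the underscore form.

    Ids without a prefix separator are returned unchanged.
    """
    prefix, sep, local = subject_id.partition(":")
    if not sep:
        return subject_id
    # Only the first dot separates class from slot; later dots are kept.
    return f"{prefix}:{local.replace('.', '_', 1)}"


print(slot_subject_id("FileCollection", "file_count"))
print(normalize_subject_id("d4d:FileCollection.file_count"))
```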
The skill doc still pointed contributors at the pre-rename paths (src/data_sheets_schema/alignment/, data/mappings/, src/alignment/, tests/test_alignment/) so its grep, git-add, and validation snippets no longer matched the canonical files. Repointed every reference to the renamed directories.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The data/semantic_exchange/ directory had grown to seven SSSOM TSV
copies, several of which were stale snapshots or byte-identical
duplicates of the canonical files in
src/data_sheets_schema/semantic_exchange/. Two of them were a v1/v2
pair that were impossible to interpret without comparing dates by
hand (v1 was the newer 2026-04-09 / 284-attr file; v2 was a stale
2026-03-23 / 268-attr file).
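One way to confirm which copies are byte-identical, rather than comparing dates by hand, is to group files by content hash. The sketch below is an illustration with hypothetical helper names, not the procedure used in this commit.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's full contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def group_identical(paths):
    """Return sorted groups of paths whose contents are byte-identical."""
    groups = defaultdict(list)
    for p in paths:
        groups[sha256_of(p)].append(str(p))
    # Only groups with more than one member are duplicates.
    return [sorted(g) for g in groups.values() if len(g) > 1]
```

A call such as `group_identical(Path("data/semantic_exchange").glob("*.tsv"))` would then list candidate duplicate sets within one directory; comparing across the data/ and src/ trees means passing paths from both.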
Deleted from data/semantic_exchange/:
- d4d_rocrate_sssom_mapping.tsv (stale 102-row snapshot)
- d4d_rocrate_sssom_mapping_subset.tsv (duplicate of src/)
- d4d_rocrate_sssom_comprehensive.tsv (duplicate of src/)
- d4d_rocrate_sssom_uri_mapping.tsv (duplicate of src/)
- d4d_rocrate_sssom_uri_comprehensive_v1.tsv (duplicate of src/'s
canonical d4d_rocrate_sssom_uri_comprehensive.tsv)
- d4d_rocrate_sssom_uri_comprehensive_v2.tsv (stale older snapshot)
- d4d_rocrate_sssom_uri_interface.tsv (orphan; not referenced
anywhere in code or Make)
Kept in data/semantic_exchange/ (canonical here):
- d4d_rocrate_structural_mapping.sssom.tsv
- d4d_rocrate_structural_mapping_summary.md
- STRUCTURAL_MAPPING_ANALYSIS.md
- uri_mapping_recommendations.md
- README.md (rewritten to point at src/.../semantic_exchange/ for
everything except the structural mapping)
Updated tests/test_semantic_exchange/test_sssom_validation.py to
look up comprehensive / uri / uri_comprehensive in the canonical
src/ tree instead of the deleted data/ copies. Tests: 190 passed,
2 skipped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Conflict resolution: kept the name_cleanup intent: canonical SSSOMs in src/data_sheets_schema/semantic_exchange/ only; data/semantic_exchange/ keeps just the structural mapping (sssom-py compatible) and analysis docs.
- Confirmed deletion of duplicate / stale TSVs from data/semantic_exchange/ that main had recreated as part of its rename of data/mappings/ → data/semantic_exchange/
- Kept HEAD's lean README.md (points at canonical src/.../semantic_exchange/ for everything except the structural mapping) over main's older "D4D Mapping Files" version that referenced v1/v2/uri_interface
- Resolved test_sssom_validation.py to use src_dir for comprehensive, uri, and uri_comprehensive lookups
src/.../semantic_exchange/*.tsv files are byte-identical on both sides; accepted ours.
Tests: 190 passed, 2 skipped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pull request overview
This PR consolidates the D4D ↔ RO-Crate/FAIRSCAPE “semantic exchange” artifacts under semantic_exchange/ (replacing older alignment/ + data/mappings/ locations), updates generator defaults and downstream consumers, and adds user-facing documentation for D4D-Core + the exchange layer.
Changes:
- Move/standardize paths for SKOS + SSSOM artifacts to src/data_sheets_schema/semantic_exchange/ and data/semantic_exchange/, updating tests, CLI defaults, docs, and Make targets accordingly.
- Add new documentation pages (d4d_core.md, semantic_exchange.md) and update MkDocs navigation + repo README.
- Introduce helper tooling/docs for maintaining mappings (e.g., add_slot_uris.py, /d4d-add-mapping command doc).
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| tests/test_semantic_exchange/test_sssom_validation.py | Update tests to new semantic_exchange artifact locations and include both data/ + src/ TSV sources. |
| tests/test_semantic_exchange/__init__.py | Add package docstring for semantic exchange tests. |
| tests/test_fairscape_integration/test_sssom_reader.py | Update structural mapping path to data/semantic_exchange/.... |
| tests/test_fairscape_integration/test_sssom_integration.py | Update structural mapping path to data/semantic_exchange/.... |
| src/semantic_exchange/implement_uri_mappings.py | Update docstring paths/usages to semantic_exchange locations. |
| src/semantic_exchange/generate_structural_mapping.py | Change default output directory to data/semantic_exchange. |
| src/semantic_exchange/generate_sssom_uri_mapping.py | Update default SKOS/output paths to semantic_exchange directory. |
| src/semantic_exchange/generate_sssom_mapping.py | Update default SKOS/output paths to semantic_exchange directory. |
| src/semantic_exchange/generate_comprehensive_sssom_uri.py | Update default SKOS/output paths to semantic_exchange directory. |
| src/semantic_exchange/generate_comprehensive_sssom.py | Update default SKOS/output paths to semantic_exchange directory. |
| src/semantic_exchange/add_slot_uris.py | Add a new helper script to apply slot_uri recommendations to schema YAMLs. |
| src/semantic_exchange/add_module_column.py | Update mappings output directory to data/semantic_exchange. |
| src/fairscape_integration/fairscape_to_d4d.py | Update default semantic SSSOM mapping path to semantic_exchange directory. |
| src/fairscape_integration/README_STANDARD_TOOLING.md | Update examples to semantic_exchange paths (one path still incorrect; see comments). |
| src/data_sheets_schema/semantic_exchange/d4d_rocrate_sssom_uri_mapping.tsv | Add URI-level SSSOM mapping file under canonical semantic_exchange path. |
| src/data_sheets_schema/semantic_exchange/d4d_rocrate_sssom_mapping_subset.tsv | Add subset semantic SSSOM TSV under canonical semantic_exchange path. |
| src/data_sheets_schema/semantic_exchange/d4d_rocrate_skos_alignment.ttl | Update alignment TTL (prefixes, version/date, added class/slot triples, updated stats). |
| src/data_sheets_schema/schema/data_sheets_schema_core_all.yaml | Update see_also reference to new SKOS TTL location. |
| src/data_sheets_schema/schema/data_sheets_schema_core.yaml | Update see_also reference to new SKOS TTL location. |
| src/data_sheets_schema/schema/D4D_Core.yaml | Update see_also reference to new SKOS TTL location. |
| src/data_sheets_schema/alignment/d4d_rocrate_sssom_uri_mapping.tsv | Remove old URI SSSOM mapping from deprecated alignment directory. |
| src/data_sheets_schema/alignment/d4d_rocrate_sssom_mapping_subset.tsv | Remove old subset SSSOM mapping from deprecated alignment directory. |
| src/data_sheets_schema/alignment/d4d_rocrate_sssom_mapping.tsv | Remove old semantic SSSOM mapping from deprecated alignment directory. |
| notes/SEMANTIC_EXCHANGE_IMPLEMENTATION.md | Update references to the new SKOS TTL path. |
| mkdocs.yml | Add nav entries for D4D-Core and Semantic Exchange docs pages. |
| docs/semantic_exchange.md | Add user-facing documentation for artifacts, generators, validation, and workflow. |
| docs/home.md | Add entry points/links for D4D-Core and Semantic Exchange docs. |
| docs/d4d_core.md | Add user-facing documentation for the D4D-Core schema subset. |
| data/semantic_exchange/uri_mapping_recommendations.md | Add URI mapping recommendation document under new directory. |
| data/semantic_exchange/d4d_rocrate_structural_mapping_summary.md | Add structural mapping summary under new directory. |
| data/semantic_exchange/d4d_rocrate_structural_mapping.sssom.tsv | Add/update structural mapping with additional rows for newly mapped classes/slots. |
| data/semantic_exchange/STRUCTURAL_MAPPING_ANALYSIS.md | Update structural mapping analysis to new script path + output locations. |
| data/semantic_exchange/README.md | Add README describing what belongs in data/semantic_exchange/. |
| data/poster_assets/figures/fig7_rocrate_profile.dot | Add DOT diagram source for RO-Crate profile figure. |
| data/mappings/d4d_rocrate_sssom_uri_interface.tsv | Remove old interface URI mapping file from deprecated directory. |
| data/mappings/README.md | Remove old mappings README from deprecated directory. |
| README.md | Document D4D-Core and Semantic Exchange as first-class entry points and update repo structure overview. |
| Makefile | Repoint SSSOM/SKOS generator variables and outputs to semantic_exchange paths. |
| .claude/commands/d4d-add-mapping.md | Add a new “add mapping” command doc describing the workflow to extend exchange layer mappings. |
| .claude/commands/README.md | Register /d4d-add-mapping in the commands index. |
realmarcin added a commit that referenced this pull request on Apr 29, 2026
Conflict resolution:
- Canonical SSSOM/SKOS files at src/data_sheets_schema/semantic_exchange/: kept ours (114-row mapping, plus 7 d4d-core class additions on top of the 107-row baseline that PR #147 already shipped, plus expanded SKOS TTL).
- Mapping TSVs duplicated under data/semantic_exchange/: deleted. PR #148 (Name cleanup) already moved them to the canonical src/data_sheets_schema/semantic_exchange/ location.
- Poster figures added by main (fig7_rocrate_profile.{dot,png}, fig8_exchange_butterfly.png): removed per project rule that poster artifacts don't get committed here.
- README + test_sssom_validation.py: took main's version (correctly reflects the post-#148 structural/canonical split).
- docs/html_output/concatenated/curated/*.html re-rendered from current renderer + curated YAMLs (generated, not hand-merged).
- data/semantic_exchange/d4d_rocrate_structural_mapping.sssom.tsv: kept ours (superset of main).
Tests: tests.test_semantic_exchange.test_sssom_validation passes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>