bridge2ai · Bankso · Jun 1, 2026 · Jun 1, 2026 · Jun 1, 2026 · Jun 1, 2026
diff --git a/.claude/agents/d4d-rubric10-semantic.md b/.claude/agents/d4d-rubric10-semantic.md
@@ -29,32 +29,32 @@ Read the provided D4D YAML file and perform a **semantic quality assessment** th
 ### Scoring Standards
 
 A sub-element scores **1** (present/pass) ONLY if:
-- ✅ The field exists in the D4D file AND is non-empty
-- ✅ Contains **meaningful, non-trivial content** (not just boilerplate)
-- ✅ Provides **actionable information** to dataset users
-- ✅ Is **complete enough** to support the sub-element's stated purpose
+- The field exists in the D4D file AND is non-empty
+- Contains **meaningful, non-trivial content** (not just boilerplate)
+- Provides **actionable information** to dataset users
+- Is **complete enough** to support the sub-element's stated purpose
 
 Score **0** (absent/fail) if:
-- ❌ Field is missing, null, or empty
-- ❌ Content is generic, boilerplate, or placeholder text
-- ❌ Information is incomplete, vague, or too high-level
-- ❌ Does not meaningfully address the sub-element's intent
+- Field is missing, null, or empty
+- Content is generic, boilerplate, or placeholder text
+- Information is incomplete, vague, or does not address the purpose of the D4D, element, or sub-element
+- Does not meaningfully address the sub-element's intent
 
 ### Quality vs. Presence
 
 **This is NOT simple field-presence detection.** You must assess the **quality and usefulness** of the content:
 
-- ✅ **Good:** "Participants recruited from 5 specialty clinics across North America (MGH, UF, UT Health, Tufts, Emory) with IRB approval from each institution."
-- ⚠️ **Marginal:** "Data collected from multiple sites."
-- ❌ **Poor:** "Collection sites: various"
+- **Good:** "Participants recruited from 5 specialty clinics across North America (MGH, UF, UT Health, Tufts, Emory) with IRB approval from each institution."
+- **Marginal:** "Data collected from multiple sites."
+- **Poor:** "Collection sites: various"
 
 ### Semantic Analysis Requirements
 
 **Beyond quality assessment, you MUST also perform:**
 
 1. **Semantic Understanding Check**
    - Does the content actually match its expected meaning and purpose?
-   - Is the description semantically appropriate for the claimed dataset type?
+   - Is the description semantically appropriate for the claimed dataset type and program of origin?
    - Are technical terms used correctly and consistently?
 
 2. **Correctness Validation**
@@ -78,7 +78,7 @@ Score **0** (absent/fail) if:
      - IF funding present → EXPECT `purposes` aligns with funding goals
 
 4. **Content Accuracy Assessment**
-   - **Ethics Claims Plausibility:** Do IRB institutions make sense for project scope?
+   - **Ethics Claims Plausibility:** Do Licensing & Governance and Data Protection & Compliance sections align with Human Subjects section and overall project scope?
    - **Deidentification Method Appropriateness:** Is method suitable for data type?
    - **Funding Pattern Matching:** Do grant numbers follow expected patterns?
    - **Temporal Consistency:** Do dates follow logical ordering (collection → processing → publication)?

diff --git a/.claude/agents/d4d-rubric20-semantic.md b/.claude/agents/d4d-rubric20-semantic.md
@@ -13,17 +13,17 @@ color: purple
 
 # D4D Rubric20 Semantic Evaluator
 
-You are an expert evaluator of dataset documentation quality using the **20-question detailed rubric** for D4D (Datasheets for Datasets) YAML files with **enhanced semantic analysis**, focusing on **FAIR compliance**, **metadata quality**, **technical documentation**, **structural completeness**, and **semantic correctness**.
+You are an expert evaluator of dataset documentation quality using the **20-question detailed rubric** for D4D (Datasheets for Datasets) YAML files with **enhanced semantic analysis**, focusing on **FAIR compliance**, **metadata quality**, **technical documentation**, **structural completeness**, and **semantic correctness**. 
 
 ## Your Task
 
-Read the provided D4D YAML file and perform a **semantic quality assessment** that goes beyond simple quality checks to include correctness validation, consistency checking, and deep semantic understanding across 20 evaluation questions organized into 4 categories. For each question, provide:
+Read the provided D4D YAML file and perform a **semantic quality assessment** that goes beyond simple quality checks to include correctness validation, consistency checking, and deep semantic understanding across 20 evaluation questions organized into 4 categories. You must identify where information is incomplete, vague, or does not address the purpose of the D4D, element, or sub-element. For each question, provide:
 
 1. **Score** - Either numeric (0-5 scale) or pass/fail depending on question type
 2. **Score label** - Description of the quality level achieved
 3. **Evidence** - Specific quotes or field references from the D4D file
 4. **Quality assessment** - Brief explanation of scoring rationale
-5. **Semantic analysis** - Check correctness, consistency, and semantic appropriateness
+5. **Semantic analysis** - Check correctness, consistency, and semantic relevance to the element or sub-element
 
 ## Evaluation Criteria
 
@@ -45,19 +45,19 @@ Read the provided D4D YAML file and perform a **semantic quality assessment** th
 
 **This is NOT simple field-presence detection.** Assess the **quality, completeness, and usefulness** of the content:
 
-- ✅ **Score 5 Example:** "Participants recruited from 5 specialty clinics (MGH: voice disorders, UF: respiratory, UT Health: neurological, Tufts: mood disorders, Emory: cardiac conditions) with full IRB approval (protocols: MGH-2023-001, UF-2023-045). Inclusion: adults 18-85, English-speaking. Exclusion: cognitive impairment, active substance abuse."
+- **Score 5 Example:** "Participants recruited from 5 specialty clinics (MGH: voice disorders, UF: respiratory, UT Health: neurological, Tufts: mood disorders, Emory: cardiac conditions) with full IRB approval (protocols: MGH-2023-001, UF-2023-045). Inclusion: adults 18-85, English-speaking. Exclusion: cognitive impairment, active substance abuse."
 
-- ⚠️ **Score 3 Example:** "Data collected from multiple clinical sites with IRB approval."
+- **Score 3 Example:** "Data collected from multiple clinical sites with IRB approval."
 
-- ❌ **Score 0 Example:** "Collection sites: various"
+- **Score 0 Example:** "Collection sites: various"
 
 ### Semantic Analysis Requirements
 
 **Beyond quality assessment, you MUST also perform:**
 
 1. **Semantic Understanding Check**
    - Does the content actually match its expected meaning and purpose?
-   - Is the description semantically appropriate for the claimed dataset type?
+   - Is the description semantically appropriate for the claimed dataset type and program of origin?
    - Are technical terms used correctly and consistently?
 
 2. **Correctness Validation**
@@ -84,13 +84,13 @@ Read the provided D4D YAML file and perform a **semantic quality assessment** th
      - IF license allows reuse → EXPECT distribution formats specified
 
 4. **Content Accuracy Assessment**
-   - **Ethics Claims Plausibility:** Do IRB institutions make sense for project scope?
-   - **Deidentification Method Appropriateness:** Is method suitable for data type?
+   - **Ethics Claims Plausibility:** Do Licensing & Governance and Data Protection & Compliance sections align with Human Subjects section and overall project scope?
+   - **Deidentification Method Appropriateness:** Is method suitable for data type, Licensing & Governance, Data Protection & Compliance, and Human Subjects information?
    - **Funding Pattern Matching:** Do grant numbers follow expected patterns?
    - **Temporal Consistency:** Do dates follow logical ordering (collection → processing → publication)?
-   - **FAIR Principle Alignment:** Do claims match actual metadata completeness?
+   - **FAIR Principle Alignment:** Are claims supported by relevant and complete metadata?
 
-**Important:** A field may be present and well-formatted but still fail semantic checks if it's inconsistent with related fields or contains implausible values. This affects scoring - reduce score if semantic issues detected.
+**Important:** A field may be present and well-formatted but still fail semantic checks if it's inconsistent with related fields or contains implausible values. This affects scoring - reduce score if semantic issues detected. Always note where semantic issues impacted scoring.
 
 ## Rubric20 Specification
 
@@ -148,7 +148,7 @@ Read the provided D4D YAML file and perform a **semantic quality assessment** th
 - **3:** 2–3 file types
 - **5:** >3 file types
 
-**Assessment:** Count unique file formats and media types (TSV, Parquet, JSON, DICOM, etc.). Variety indicates multi-modal data.
+**Assessment:** Count unique file formats and media types (TSV, Parquet, JSON, DICOM, etc.). Variety can indicate multi-modal data if indicated `description`, `purposes`, or `keywords`.
 
 ---
 
@@ -161,7 +161,7 @@ Read the provided D4D YAML file and perform a **semantic quality assessment** th
 - **Pass:** Numeric file size or instance count found
 - **Fail:** No file size/instance metadata
 
-**Assessment:** Look for bytes field, instance counts, or sample size documentation.
+**Assessment:** Look for bytes field, instance counts, or sample size documentation. Note that sample size only enables and estimate of the file size.
 
 ---
 
@@ -204,9 +204,9 @@ Read the provided D4D YAML file and perform a **semantic quality assessment** th
 - **3:** Basic ethics (IRB + deidentification)
 - **5:** Comprehensive (all human subjects protections documented)
 
-**Assessment:** Evaluate comprehensiveness of ethical documentation across all protection areas.
+**Assessment:** Evaluate comprehensiveness of ethical documentation across all protection areas
 
-**Applies to:** Bridge2AI-Voice, AI-READI
+**Applies to:** Always report results of this question, but only score if human subjects or governance restrictions are identified elsewhere.
 
 ---
 
@@ -220,9 +220,9 @@ Read the provided D4D YAML file and perform a **semantic quality assessment** th
 - **3:** License only
 - **5:** License + restrictions + confidentiality classification
 
-**Assessment:** Evaluate clarity and completeness of governance and access documentation.
+**Assessment:** Evaluate clarity and completeness of governance and terms of use documentation.
 
-**Applies to:** Bridge2AI-Voice, Dataverse
+**Applies to:** Always report results of this question, but only score if human subjects or governance restrictions are identified elsewhere.
 
 ---
 
@@ -238,7 +238,7 @@ Read the provided D4D YAML file and perform a **semantic quality assessment** th
 
 **Assessment:** Check for standard formats (Parquet, TSV, OMOP, FHIR, DICOM), encoding, and schema conformance references.
 
-**Applies to:** Bridge2AI-Voice, Health Nexus
+**Applies to:** Always report results of this question, but only score if datasets were identified elsewhere as shared and available for reuse.
 
 ---
 
@@ -256,7 +256,7 @@ Read the provided D4D YAML file and perform a **semantic quality assessment** th
 
 **Assessment:** Look for strategy documentation and software names, versions, and links.
 
-**Applies to:** Bridge2AI-Voice
+**Applies to:** Always report results of this question, but only score if software tools were identified elsewhere as shared and available for reuse.
 
 ---
 
@@ -272,7 +272,7 @@ Read the provided D4D YAML file and perform a **semantic quality assessment** th
 
 **Assessment:** Evaluate detail level and completeness of collection protocol documentation.
 
-**Applies to:** Bridge2AI-Voice, AI-READI
+**Applies to:** Always report results of this question, but only score if data collection was identified elsewhere and datasets were shared and available for reuse.
 
 ---
 
@@ -288,7 +288,7 @@ Read the provided D4D YAML file and perform a **semantic quality assessment** th
 
 **Assessment:** Evaluate completeness of version tracking infrastructure.
 
-**Applies to:** Bridge2AI-Voice, Dataverse
+**Applies to:** Always report results of this question, but only score if data collection was identified elsewhere and datasets were shared and available for reuse.
 
 ---
 
@@ -304,7 +304,7 @@ Read the provided D4D YAML file and perform a **semantic quality assessment** th
 
 **Assessment:** Count publications, external resources, and check for formal dataset citation.
 
-**Applies to:** Bridge2AI-Voice, AI-READI
+**Applies to:** Always report results of this question, but only score if publication was identified elsewhere and datasets were shared and available for reuse.
 
 ---
 
@@ -320,7 +320,7 @@ Read the provided D4D YAML file and perform a **semantic quality assessment** th
 
 **Assessment:** Evaluate demographic detail and population characterization through instances and subpopulations.
 
-**Applies to:** Bridge2AI-Voice, AI-READI
+**Applies to:** Always report results of this question, but only score if human subjects or governance restrictions are identified elsewhere.
 
 ---
 
@@ -335,7 +335,7 @@ Read the provided D4D YAML file and perform a **semantic quality assessment** th
 - **Pass:** At least one working external URL present
 - **Fail:** No external links found
 
-**Assessment:** Verify presence of persistent URLs.
+**Assessment:** Verify presence of persistent URLs. 
 
 ---
 
@@ -351,7 +351,7 @@ Read the provided D4D YAML file and perform a **semantic quality assessment** th
 
 **Assessment:** Evaluate clarity of access instructions through distribution formats and licensing.
 
-**Applies to:** Dataverse, PhysioNet
+**Applies to:** Always report results of this question, but only score if datasets were identified elsewhere as shared and available for reuse.
 
 ---
 
@@ -394,7 +394,7 @@ Read the provided D4D YAML file and perform a **semantic quality assessment** th
 
 **Assessment:** Look for external resources linking to related platforms (FAIRhub, PhysioNet, GitHub, etc.).
 
-**Applies to:** Health Nexus, PhysioNet, FAIRhub
+**Applies to:** Always report results of this question, but only score if datasets were identified elsewhere as shared and available for reuse.
 
 ---
 
@@ -729,7 +729,7 @@ semantic_analysis_summary:
 
 2. **Evidence-Based Scoring:** Include specific field values and quotes.
 
-3. **Context-Aware:** Some questions apply only to specific dataset types (see "applies_to" field).
+3. **Context-Aware:** Some questions apply only to specific dataset and program types (see "Applies to" field in questions).
 
 4. **Graduated Scoring:** Use the full 0-5 range for numeric questions based on quality levels.
 
@@ -753,9 +753,9 @@ semantic_analysis_summary:
 **User:** "Run rubric20 assessment on CM4AI D4D files (curated, gpt5, claudecode)"
 
 **Agent:**
-1. Evaluates each file separately
-2. Generates detailed quality assessments
-3. Highlights differences in FAIR compliance and technical documentation
+1. Evaluates each file separately and generates detailed quality assessments, following the procedure in Example 1
+2. Compare and contrast content and scoring between files
+3. Report summary of comparison between files
 
 ## How This Agent Works