Proposal: Knowledge Maturity Model — combining confidence, evidence, lifecycle, and verification gates
This proposal builds on several active discussions:
Each covers one dimension. What's missing is a unified model that connects them: a claim's reliability depends on where it sits in a maturity lifecycle, not just a static score.
The gap
An OKF bundle today can express `See [customers](/tables/customers.md).` #148 adds `rel`:
links:
  - target: /tables/customers.md
    rel: depends_on
But an agent still can't ask:
- Is this `depends_on` verified against the live system, or inferred from documentation?
- How many independent sources support it?
- Should I trust this claim enough to act on it?
We need a maturity model, not just metadata fields.
Proposal: Knowledge Maturity Model
          EVIDENCE (citations + provenance)
              │
          CONFIDENCE (epistemic score)
              │
         ┌────┴────┐
    proposed ────────► active ──► expired
       │
       ▼
    low_quality / noise
Axis 1: Lifecycle (proposed → active → expired)
| Status | Meaning | Example |
|---|---|---|
| proposed | Extracted, not yet verified | LLM-extracted candidate for review |
| active | Passed verification gate | Verified claim in production |
| expired | Superseded or invalidated | Outdated, kept for audit trail |
The lifecycle is not monotonic: an `active` claim rolls back to `proposed` if regression tests fail, and a `proposed` claim that never gains enough evidence is periodically garbage-collected.
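The transitions described above can be sketched as a small state machine. This is a minimal illustration, not part of OKF or the production pipeline: the `Lifecycle` enum, the `ALLOWED` table, and `transition` are hypothetical names, and treating `expired` as a terminal state is an assumption.

```python
from enum import Enum


class Lifecycle(str, Enum):
    PROPOSED = "proposed"
    ACTIVE = "active"
    EXPIRED = "expired"


# Allowed moves, read off the lifecycle description.
# Assumption: `expired` is terminal and is never revived.
ALLOWED = {
    Lifecycle.PROPOSED: {Lifecycle.ACTIVE, Lifecycle.EXPIRED},
    Lifecycle.ACTIVE: {Lifecycle.PROPOSED, Lifecycle.EXPIRED},  # rollback allowed
    Lifecycle.EXPIRED: set(),
}


def transition(current: Lifecycle, target: Lifecycle) -> Lifecycle:
    """Return `target` if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Keeping the table explicit makes the non-monotonic rollback (`active` to `proposed`) a visible, reviewable rule rather than an implicit side effect of the pipeline.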
Axis 2: Confidence (epistemic reliability)
A floating-point value in `[0.0, 1.0]`, finer-grained than categorical labels and needed by downstream agents for ranking:
| Range | Meaning |
|---|---|
| ≥ 0.95 | `keyword_list` match / manually confirmed |
| ≥ 0.75 | LLM extraction with multi-document consensus |
| ≥ 0.50 | Single-document LLM extraction |
| < 0.50 | Regex pattern match (high noise, treat as suggestion) |
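A consumer that wants a coarse label alongside the raw score could map the ranges in the table like this. The function name and the band labels (`confirmed`, `consensus`, `single_source`, `suggestion`) are hypothetical shorthand for the table rows, not proposed spec vocabulary.

```python
def confidence_band(score: float) -> str:
    """Map a confidence score in [0.0, 1.0] to a coarse band from the table."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence out of range: {score}")
    if score >= 0.95:
        return "confirmed"      # keyword_list match / manually confirmed
    if score >= 0.75:
        return "consensus"      # LLM extraction, multi-document consensus
    if score >= 0.50:
        return "single_source"  # single-document LLM extraction
    return "suggestion"         # regex pattern match, high noise
```

The raw float remains the value agents rank on; the band is only a convenience for display or filtering.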
Axis 3: Evidence (per-edge provenance)
Not just `IMPORTED FROM X`, but which document, which sentence, which quote:
evidence:
  - document: /tech/some-doc.md
    quote: "system X uses technology Y for data processing"
    confidence: 0.95
  - document: /business/client-interview.md
    quote: "confirmed dependency in architecture review"
    confidence: 0.85
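Per-edge evidence makes the earlier question "how many independent sources support it?" answerable. The sketch below models one evidence entry and counts distinct documents. The `Evidence` class and both helpers are hypothetical, and treating "independent" as "comes from a different document" is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Evidence:
    document: str
    quote: str
    confidence: float


def independent_sources(evidence: list[Evidence]) -> int:
    """Count distinct documents; several quotes from one document count once."""
    return len({e.document for e in evidence})


def strongest(evidence: list[Evidence]) -> Optional[Evidence]:
    """Return the highest-confidence entry, or None when there is no evidence."""
    return max(evidence, key=lambda e: e.confidence, default=None)


example = [
    Evidence("/tech/some-doc.md",
             "system X uses technology Y for data processing", 0.95),
    Evidence("/business/client-interview.md",
             "confirmed dependency in architecture review", 0.85),
]
```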
Production reference implementation
Running in production on an enterprise knowledge base:
| Metric | Count |
|---|---|
| Knowledge concepts (OKF concepts) | 320 |
| Vector chunks (768d) | 1,354 |
| Knowledge graph nodes | ~207 |
| Knowledge graph edges | 330+ (proposed + active + expired) |
| Edge predicates | 8 types (uses, adopts, regulates, competes, implements, depends_on, part_of, validates) |
| Pipeline stages | 5 (extract → denoise → backfill → regress → promote) |
| Regression gate | Node recall ≥ 0.95, Edge recall ≥ 0.75 |
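The regression gate in the last row can be illustrated as a recall check against a gold set. The proposal does not say how recall is computed in production; this sketch assumes the standard definition (the fraction of gold-set items that the extraction recovered) and represents edges as hypothetical `(source, predicate, target)` tuples.

```python
def recall(expected: set, found: set) -> float:
    """Fraction of gold-set items that were recovered."""
    if not expected:
        return 1.0  # nothing to recover, so nothing was missed
    return len(expected & found) / len(expected)


def regression_gate(gold_nodes: set, found_nodes: set,
                    gold_edges: set, found_edges: set,
                    node_min: float = 0.95, edge_min: float = 0.75) -> bool:
    """True when both recall thresholds from the table are met."""
    return (recall(gold_nodes, found_nodes) >= node_min
            and recall(gold_edges, found_edges) >= edge_min)


gold_nodes = {"System X", "Technology Y"}
gold_edges = {
    ("System X", "uses", "Technology Y"),
    ("System X", "depends_on", "Technology Y"),
    ("Technology Y", "part_of", "System X"),
    ("System X", "implements", "Technology Y"),
}
```

With this gold set, recovering three of the four edges gives an edge recall of exactly 0.75, which passes; recovering only two gives 0.5, which blocks the pipeline.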
Core schema (SQLite)
CREATE TABLE knowledge_nodes (
    id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    node_type TEXT NOT NULL, -- Client | Technology | Concept | Project
    description TEXT,
    properties TEXT,
    created_at TEXT,
    updated_at TEXT
);

CREATE TABLE knowledge_edges (
    id TEXT PRIMARY KEY,
    source_id TEXT NOT NULL REFERENCES knowledge_nodes(id),
    target_id TEXT NOT NULL REFERENCES knowledge_nodes(id),
    predicate TEXT NOT NULL,
    confidence REAL DEFAULT 0.0,
    edge_status TEXT DEFAULT 'proposed', -- proposed | active | expired
    extraction_source TEXT,
    description TEXT,
    evidence_summary TEXT,
    properties TEXT,
    promotion_reason TEXT,
    created_at TEXT,
    updated_at TEXT,
    expired_at TEXT,
    valid_to TEXT,
    CHECK(edge_status IN ('proposed', 'active', 'expired'))
);

CREATE TABLE edge_evidence_links (
    id TEXT PRIMARY KEY,
    target_id TEXT NOT NULL REFERENCES knowledge_edges(id),
    document_id TEXT,
    quote TEXT,
    confidence REAL DEFAULT 0.0,
    created_at TEXT
);
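To show how an agent might read this schema, the sketch below loads a trimmed copy of the three tables into an in-memory SQLite database and lists `active` edges with the number of distinct supporting documents. Only a subset of the columns is reproduced, and the sample rows are invented for illustration; neither reflects the production data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE knowledge_nodes (
    id TEXT PRIMARY KEY, name TEXT NOT NULL, node_type TEXT NOT NULL
);
CREATE TABLE knowledge_edges (
    id TEXT PRIMARY KEY,
    source_id TEXT NOT NULL REFERENCES knowledge_nodes(id),
    target_id TEXT NOT NULL REFERENCES knowledge_nodes(id),
    predicate TEXT NOT NULL,
    confidence REAL DEFAULT 0.0,
    edge_status TEXT DEFAULT 'proposed',
    CHECK(edge_status IN ('proposed', 'active', 'expired'))
);
CREATE TABLE edge_evidence_links (
    id TEXT PRIMARY KEY,
    target_id TEXT NOT NULL REFERENCES knowledge_edges(id),
    document_id TEXT, quote TEXT, confidence REAL DEFAULT 0.0
);
""")

# Invented sample rows: one active edge with two sources, one proposed edge.
conn.executemany("INSERT INTO knowledge_nodes VALUES (?, ?, ?)", [
    ("n1", "System X", "Project"),
    ("n2", "Technology Y", "Technology"),
])
conn.executemany(
    "INSERT INTO knowledge_edges (id, source_id, target_id, predicate,"
    " confidence, edge_status) VALUES (?, ?, ?, ?, ?, ?)", [
        ("e1", "n1", "n2", "uses", 0.92, "active"),
        ("e2", "n1", "n2", "depends_on", 0.40, "proposed"),
    ])
conn.executemany("INSERT INTO edge_evidence_links VALUES (?, ?, ?, ?, ?)", [
    ("l1", "e1", "/tech/some-doc.md", "system X uses technology Y", 0.95),
    ("l2", "e1", "/business/client-interview.md", "confirmed dependency", 0.85),
])

# Active edges only, ranked by confidence, with distinct-document counts.
rows = conn.execute("""
    SELECT s.name, e.predicate, t.name, e.confidence,
           COUNT(DISTINCT l.document_id) AS evidence_count
    FROM knowledge_edges e
    JOIN knowledge_nodes s ON s.id = e.source_id
    JOIN knowledge_nodes t ON t.id = e.target_id
    LEFT JOIN edge_evidence_links l ON l.target_id = e.id
    WHERE e.edge_status = 'active'
    GROUP BY e.id
    ORDER BY e.confidence DESC
""").fetchall()
```

The `LEFT JOIN` keeps active edges that have no evidence rows, so they surface with a count of zero instead of disappearing from the result.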
Pipeline flow
watch_brain (file watcher)
│
▼
Stage 1 — extract: LLM extracts entities + relationships → stored as proposed
│
▼
Stage 2 — denoise: Rust/Python noise filter → batch-expire low-quality edges
│ (weak subjects, noise patterns, cross-domain isolation)
▼
Stage 3 — backfill: Re-link evidence quotes to proposed edges
│
▼
Stage 4 — regress: Gold Set regression gate
│ node_recall ≥ 0.95 AND edge_recall ≥ 0.75?
│ NO → block pipeline, flag for human review
│ YES → proceed
▼
Stage 5 — promote: proposed → active + write wiki Markdown
│ Only edges with confidence ≥ 0.75, importance ≥ 4,
│ evidence_count ≥ 2
▼
Agent uses active edges; proposed edges are periodically garbage-collected
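Stage 5 can be sketched as a filter that runs only when the Stage 4 gate has passed. The thresholds are the ones in the diagram. The `Edge` class and function names are hypothetical, and `importance` appears in the diagram without a defined scale or a schema column, so it is modelled here as a plain integer by assumption.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    id: str
    confidence: float
    importance: int       # assumption: integer score; scale not defined in the proposal
    evidence_count: int
    status: str = "proposed"


def eligible(edge: Edge, min_confidence: float = 0.75,
             min_importance: int = 4, min_evidence: int = 2) -> bool:
    """Promotion criteria from Stage 5 of the pipeline diagram."""
    return (edge.confidence >= min_confidence
            and edge.importance >= min_importance
            and edge.evidence_count >= min_evidence)


def promote(edges: list[Edge], gate_passed: bool) -> list[str]:
    """Flip eligible proposed edges to active; return the promoted ids."""
    if not gate_passed:
        return []  # regression gate failed: pipeline is blocked for review
    promoted = []
    for edge in edges:
        if edge.status == "proposed" and eligible(edge):
            edge.status = "active"
            promoted.append(edge.id)
    return promoted
```

Edges that fail any one criterion stay `proposed`, where the periodic garbage collection described above eventually removes them.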
Proposed frontmatter extension
---
type: Relationship
title: "System X uses Technology Y"
predicate: uses
confidence: 0.92
lifecycle: active # proposed | active | expired
extraction_source: keyword_list # how it was produced
evidence:
  - document: "/tech/some-doc.md"
    quote: "system X uses technology Y"
    confidence: 0.95
verified_by: gold_set_regression # gate type
verified_at: "2026-06-29T10:00:00Z"
---
Backward compatibility
- Bundles without `lifecycle`, `predicate`, or `evidence` remain valid OKF.
- Consumers that don't understand these fields ignore them.

`confidence` and `lifecycle` are semantically separable: a `proposed` edge can have high confidence, and an `active` edge can have low confidence (for example, a verified claim that has since gone stale).
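A tolerant consumer could read the new fields as follows, treating each one as optional and reading `confidence` and `lifecycle` independently of each other. This is a sketch under the assumption that the frontmatter has already been parsed into a dictionary; `read_maturity` is a hypothetical helper, not an OKF API.

```python
KNOWN_LIFECYCLES = {"proposed", "active", "expired"}


def read_maturity(frontmatter: dict) -> dict:
    """Extract the maturity fields, tolerating their absence or bad values."""
    lifecycle = frontmatter.get("lifecycle")
    if lifecycle not in KNOWN_LIFECYCLES:
        lifecycle = None  # missing or unknown value: treat as unspecified

    confidence = frontmatter.get("confidence")
    if (isinstance(confidence, bool)
            or not isinstance(confidence, (int, float))
            or not 0.0 <= confidence <= 1.0):
        confidence = None

    evidence = frontmatter.get("evidence")
    if not isinstance(evidence, list):
        evidence = []

    return {"lifecycle": lifecycle, "confidence": confidence, "evidence": evidence}
```

A bundle written before this proposal yields `None`, `None`, and an empty list, so existing content keeps loading unchanged.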
Questions for maintainers
- Would OKF v0.2 / v1.0 consider adding an optional `lifecycle` field to concept frontmatter, analogous to `edge_status` in graph stores?
- Should `evidence` be a spec-level concept (like `# Citations`), or remain a producer extension?
- Is there interest in a reference implementation of the pipeline as a companion tool to the Enrichment Agent?