Trust-Aware Multi-Signal Document Retrieval
Graph-Based Compliance Scoring | Gap Attribution | EU AI Act Ready
Standard RAG fails for regulatory AI. Vector similarity treats "shall ensure compliance" and "may consider compliance" as identical. Scores are opaque. Knowledge is ephemeral. There's no audit trail.
TAMR+ fixes this: a three-stage pipeline in which 65% of retrieval scoring comes from structural signals, not vector similarity. Every score is explained. Every gap is attributed. Every response is auditable.
| Component | Description | Status |
|---|---|---|
| TRACE Scoring | 5-dimension compliance scoring mapped to EU AI Act articles | Spec + Formulas |
| Gap Attribution | 5-category taxonomy decomposing score gaps into actionable causes | Spec + Examples |
| EU-RegQA-100 | 100 regulatory questions across 5 difficulty tiers | Benchmark |
| MedRegQA-50 | 50 medical device regulation questions | Benchmark |
| FinRegQA-50 | 50 financial services regulation questions | Benchmark |
| CrimNet-50 | 50 law enforcement regulation questions | Benchmark |
| HashGNN | Training-free graph embeddings via MinHash (pure NumPy) | Reference Impl |
| Paper | arXiv preprint (v2.3) | PDF + LaTeX |
Pipeline: 207ms avg latency | $0.03/workspace | Zero LLM calls during retrieval
| System | EU-RegQA | MedRegQA | FinRegQA | CrimNet | Avg |
|---|---|---|---|---|---|
| TAMR+ v2.3 (3-hop) | 0.74 | 0.69 | 0.66 | 0.63 | 0.680 |
| TAMR+ v2.3 (1-hop) | 0.67 | 0.63 | 0.61 | 0.59 | 0.625 |
| GraphCompliance | 0.554 | --- | --- | --- | 0.554 |
| Vector-only RAG | 0.41 | 0.38 | 0.39 | 0.36 | 0.385 |
Ablation: Removing any single component degrades performance by 6-27%. Vector-only RAG (0.385) scores 43% below the full pipeline (0.680), p<0.001.
```
Query
  |
  v
[Stage 1: Document Manifest Selector]  ~10ms, zero LLM
  |  5 deterministic signals
  v
[Stage 2: Multi-Phase Retrieval]  ~275ms
  |  P1: Vector ANN (35%)
  |  P2: KG Alignment (30%)
  |  P3: Causal Density (10%)
  |  P4: Marginal Selection (15%)
  |  P5: SHA-256 Lineage (-10% redundancy)
  v
[Stage 3: TRACE + Gap Attribution]
  |  T: Transparency (Art. 13, 50)
  |  R: Reasoning (Art. 15)
  |  A: Auditability (Art. 51)
  |  C: Compliance (Art. 9, 14, 26)
  |  E: Explainability (Art. 13)
  v
Score + Gap Attribution + Confidence Tier
```
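As a rough sketch of how the Stage 2 phase weights might combine (the function name and signal values are ours; the actual formulas are in the proprietary pipeline, and we assume here that each signal is normalized to [0, 1]):

```python
def stage2_score(vec_sim, kg_align, causal_density, marginal_gain, redundancy):
    """Combine the five Stage 2 phase signals with the published weights.

    The SHA-256 lineage check contributes a redundancy *penalty*,
    which is why its weight is negative.
    """
    return (0.35 * vec_sim          # P1: vector ANN similarity
          + 0.30 * kg_align         # P2: knowledge-graph alignment
          + 0.10 * causal_density   # P3: causal density
          + 0.15 * marginal_gain    # P4: marginal selection
          - 0.10 * redundancy)      # P5: lineage-detected redundancy

score = stage2_score(0.8, 0.7, 0.5, 0.6, 0.2)  # -> 0.61
```

Note that only 35% of the combined score comes from vector similarity; the remaining 65% is structural.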
Every response gets a deterministic score mapped to EU AI Act articles:
```python
TRACE = (T + R + A + C + E) / 5  # each dimension in [0, 1]
```

| Tier | Score | Meaning | Action |
|---|---|---|---|
| GREEN | >= 0.76 | Very high | Autonomous |
| BLUE | 0.66-0.75 | High | Optional review |
| YELLOW | 0.50-0.65 | Moderate | Mandatory review |
| RED | 0.20-0.49 | Low | Expert review |
| GRAY | < 0.20 | Insufficient | Blocked |
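A minimal sketch of the tier routing implied by the table (thresholds taken directly from it; the function name is ours, and we read the 0.66-0.75 band as "at least 0.66 but below 0.76"):

```python
def trace_tier(score: float) -> str:
    """Map a TRACE score in [0, 1] to a confidence tier."""
    if score >= 0.76:
        return "GREEN"   # very high -> autonomous
    if score >= 0.66:
        return "BLUE"    # high -> optional review
    if score >= 0.50:
        return "YELLOW"  # moderate -> mandatory review
    if score >= 0.20:
        return "RED"     # low -> expert review
    return "GRAY"        # insufficient -> blocked
```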
The gap is the feature: A 67% score with full gap attribution (SCG 42%, PKC 28%, DLT 8%, ADG 12%, FSC 10%) tells you exactly what to fix.
- Link Prediction for Gap Detection — Graph-based regulatory gap prediction (AUC-ROC 0.847)
- Multi-Signal Scoring — 65% structural signals, ablation-validated
- HashGNN Embeddings — 128-dim via MinHash, no GPU, no training, pure NumPy
- Cross-Domain Benchmarks — 250 questions across 4 regulatory domains
- Multi-Hop Traversal — Decay-weighted scoring, entity coverage 63.6% to 84.1%
- Cypher-Native GraphRAG — Single-query vector + graph retrieval
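One way to read decay-weighted multi-hop scoring, sketched under our own assumptions (exponential decay per hop; the decay rate, function name, and match counts are illustrative, not the production formula):

```python
def multihop_score(hop_matches, decay=0.5):
    """Score entity coverage over successive graph hops.

    hop_matches[h] is the number of query entities matched at hop h
    (0-indexed); each hop's contribution is discounted by decay**h,
    so distant hops add evidence but never dominate hop 0.
    """
    return sum(m * decay**h for h, m in enumerate(hop_matches))

# 1-hop-only vs. 3-hop traversal over the same (made-up) match counts
one_hop = multihop_score([4])        # -> 4.0
three_hop = multihop_score([4, 3, 2])  # -> 6.0
```

Deeper traversal strictly increases the score here, which is consistent with the 3-hop configuration outperforming 1-hop in the benchmark table.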
```python
import json

# Load EU-RegQA-100
with open("benchmarks/eu-regqa-100/eu_regqa_100.json") as f:
    questions = json.load(f)

# Evaluate your RAG system
for q in questions:
    response = your_system.query(q["question"])
    # Score against ground truth using the TRACE methodology
```

See trace-scoring/spec.md for the complete specification. All formulas are deterministic and can be implemented in any language.
```python
from hashgnn import HashGNN

model = HashGNN(dim=128, metapaths=4, rounds=3)
embeddings = model.fit_transform(knowledge_graph)
# embeddings: {node_id: np.array([0, 1, 0, 1, ...], dtype=bool)}
```

We report production scores (60-74%) alongside the 97% theoretical ceiling. The 20+ percentage point gap is not hidden but analyzed, explained, and attributed:
- Source Coverage Gap (42%): Small workspace (4 docs). Fix: add more documents.
- Parametric Knowledge Cost (28%): LLM fills gaps. Fix: domain-specific sources.
- Domain Language Tax (8%): Regulatory vocabulary. Fix: glossary expansion.
- Attribution Density Gap (12%): Formatting over evidence. Fix: citation improvements.
- Structural Ceiling (10%): Irreducible (3% system-wide). Disclosed per Art. 13.
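A sketch of how the attribution percentages decompose an absolute score gap (the shares and the 0.97 ceiling are from the text above; the dict keys abbreviate the five categories, and the function name is ours):

```python
GAP_SHARES = {"SCG": 0.42, "PKC": 0.28, "DLT": 0.08, "ADG": 0.12, "FSC": 0.10}

def attribute_gap(score, ceiling=0.97):
    """Split (ceiling - score) across the five gap categories."""
    gap = ceiling - score
    return {cat: round(share * gap, 4) for cat, share in GAP_SHARES.items()}

breakdown = attribute_gap(0.67)  # the 67% example from the text
# e.g. breakdown["SCG"] -> 0.126: 12.6 points of the 30-point gap
```

Because the shares sum to 1, the per-category contributions always reconstruct the full gap, which is what makes each category independently actionable.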
This repo contains research artifacts and a reference implementation. The production TAMR+ system (tracegov.ai) includes proprietary components not released here:
- Production pipeline source code and deployment infrastructure
- Neo4j Cypher query templates and graph schema
- Regex-based document classification rules
- Causal density computation internals
- SHA-256 lineage chain implementation
- Tier routing and escalation logic
See NOTICE for full details. Methods are protected by European Patent Applications EP26162901.8 and EP26166054.2.
| Feature | TAMR+ | RAGAS | DeepEval | COMPL-AI | GraphCompliance |
|---|---|---|---|---|---|
| Gap attribution | 5 categories | No | No | No | No |
| Predictive gaps | Yes | No | No | No | No |
| Formula-based (no ML) | Yes | No | No | Partial | Partial |
| EU AI Act mapping | 8/8 articles | 0/8 | 0/8 | 3/8 | 0/8 |
| Cross-domain | 4 domains | N/A | N/A | 1 | 1 |
| Audit trail | Yes (Art. 51) | No | No | No | No |
| Production deployed | Yes | N/A | N/A | No | No |
```bibtex
@article{kumar2026tamrplus,
  title={TAMR+: Trust-Aware Multi-Signal Document Retrieval with
         Graph-Based Compliance Scoring and Gap Attribution
         for Regulatory AI Systems},
  author={Kumar, Harish},
  journal={SSRN Electronic Journal},
  year={2026},
  note={European Patent Applications EP26162901.8 and EP26166054.2}
}
```

- Paper: SSRN 6359818
- Production: tracegov.ai
- Patents: EP26162901.8 (filed 2026-03-06) and EP26166054.2 (filed 2026-03-19)
- Company: Quantamix Solutions B.V.
Apache 2.0 — See LICENSE.
The methods are covered by European Patent Applications EP26162901.8 and EP26166054.2. The Apache 2.0 license includes a patent grant for use of the open-source components.
Built by Quantamix Solutions B.V. | Uithoorn, The Netherlands