dev404ai/README.md

Oleg Solozobov

Production AI Reliability & Operational Evidence · OEP | Agent Runtime · Observability · Evals · Replay

Platform artifacts for production AI and agent runtime systems: workflow orchestration, tool-call permission and identity records, agent-step telemetry, release manifests, eval traces, replay / reconstruction packets, rollout gates, and incident evidence.

Author of the Operational Evidence Plane (OEP) for Agentic AI, an open reference architecture for the operational-evidence layer of agent runtime systems. Version 0.3.0 joins release manifests, runtime events, permission records, traces, evals, replay state, and reconstruction packets under stable decision_id values, and supports counterfactual replay across policy, cost, drift, cache, and identity metadata. Concept DOI 10.5281/zenodo.20051036; v0.3.0 archive 10.5281/zenodo.20363793.
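To illustrate what joining evidence under a stable decision_id can look like, here is a minimal Python sketch. It is not the OEP API: the record layout, the `kind` field, the `RECORD_KINDS` tuple, and both helper functions are hypothetical names chosen for this example, loosely following the artifact types listed above.

```python
from collections import defaultdict

# Hypothetical record kinds, loosely named after the artifact types above.
RECORD_KINDS = ("release_manifest", "runtime_event", "permission_record", "trace", "eval")


def join_by_decision_id(records):
    """Group heterogeneous evidence records under their shared decision_id."""
    joined = defaultdict(lambda: defaultdict(list))
    for record in records:
        joined[record["decision_id"]][record["kind"]].append(record)
    # Convert nested defaultdicts to plain dicts so lookups do not create keys.
    return {decision_id: dict(kinds) for decision_id, kinds in joined.items()}


def missing_kinds(joined, decision_id):
    """List the record kinds for which a decision has no evidence at all."""
    present = joined.get(decision_id, {})
    return [kind for kind in RECORD_KINDS if kind not in present]


# Illustrative records only; the field names are assumptions.
records = [
    {"decision_id": "d-001", "kind": "release_manifest", "model": "example-model"},
    {"decision_id": "d-001", "kind": "runtime_event", "step": "tool_call"},
    {"decision_id": "d-001", "kind": "permission_record", "allowed": True},
    {"decision_id": "d-002", "kind": "runtime_event", "step": "tool_call"},
]

joined = join_by_decision_id(records)
for decision_id in sorted(joined):
    print(decision_id, "missing:", missing_kinds(joined, decision_id))
```

Keying every record on the same identifier means a gap in evidence shows up as a missing kind for that decision instead of a silent omission.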

Method spec: Decision Evidence Maturity Model (DEMM) - arXiv:2605.04093. Empirical pilot: arXiv:2605.12078.

PhD research: the Operational Evidence Plane and counterfactual replay for production AI and agent runtime systems.
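As a rough illustration of counterfactual replay along the policy dimension, the sketch below re-evaluates recorded decisions under an alternative policy and reports which outcomes would change. It is an assumption-laden toy, not the OEP implementation: the decision layout, the `replay` function, and `stricter_policy` are all hypothetical.

```python
def replay(decisions, policy):
    """Re-evaluate recorded decisions under `policy` and return the changed ones."""
    changed = []
    for decision in decisions:
        replayed = policy(decision["request"])
        if replayed != decision["outcome"]:
            changed.append(
                {
                    "decision_id": decision["decision_id"],
                    "recorded": decision["outcome"],
                    "replayed": replayed,
                }
            )
    return changed


def stricter_policy(request):
    """Hypothetical alternative policy: deny every write-scoped tool call."""
    return "deny" if request["scope"] == "write" else "allow"


# Illustrative recorded decisions; the field names are assumptions.
decisions = [
    {"decision_id": "d-001", "request": {"tool": "repo.read", "scope": "read"}, "outcome": "allow"},
    {"decision_id": "d-002", "request": {"tool": "repo.push", "scope": "write"}, "outcome": "allow"},
]

changed = replay(decisions, stricter_policy)
print(changed)
```

The replay only works if each recorded decision keeps enough of the original request to evaluate the new policy against it, which is the kind of evidence the records above are meant to preserve.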


Start Here


Research Preprints

Agentic AI / DEMM:

Operational evidence foundation:


Focus Areas

  • Production AI Reliability (release manifests, eval-to-release gates, incident evidence)
  • Agent Runtime / Workflow Infrastructure (tool-use workflows, state / replay, safe execution evidence)
  • Observability & Evals Infrastructure (eval / telemetry linkage, traces, quality loops)
  • Operational Evidence & Incident Reconstruction (event identity, lineage, reconstruction packets)
  • Agent Permissions / Identity / Policy Controls (tool-call authorization, policy lifecycle, agent-to-service evidence)
  • Release Gates & Reliability Engineering (canary, shadow, rollback, postmortem packets)
  • Platform / Control Plane Engineering (distributed services, Kubernetes, GitOps, multi-cloud)
  • Data & Streaming Infrastructure (events, schemas, evidence joins, delayed-label systems)

Selected Public Proof

Current strongest public proof:

  • operational-evidence-plane - v0.3.0 public reference implementation for production AI / agent-runtime operational evidence: release manifests, agent-step events, tool-call permission packets, operational traces, eval results, reconstruction packets, deterministic code-review demo, Bedrock translation, and counterfactual replay across policy / cost / drift / cache / identity metadata. Apache-2.0. Concept DOI: 10.5281/zenodo.20051036; v0.3.0 DOI: 10.5281/zenodo.20363793.
  • decision-trace-reconstructor - v0.1.0 trace reconstruction tool that reports evidenced, partial, absent, and opaque decision facts across LangSmith, OpenTelemetry, Bedrock, OpenAI Agents, Anthropic, MCP, and other adapters. Zenodo DOI: 10.5281/zenodo.19851574.
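The four-way status vocabulary used by decision-trace-reconstructor (evidenced, partial, absent, opaque) can be sketched as a small classifier. The definitions below are assumptions made for illustration, not the tool's actual rules: `EXPECTED_FIELDS`, the `redacted` flag, and `classify_fact` are hypothetical.

```python
# Hypothetical fields a fully evidenced decision fact would carry.
EXPECTED_FIELDS = ("input", "output", "policy")


def classify_fact(record):
    """Classify one decision fact under the assumed definitions of the four statuses."""
    if record is None:
        return "absent"  # nothing was recorded for this fact
    if record.get("redacted"):
        return "opaque"  # a record exists, but its content cannot be inspected
    present = [field for field in EXPECTED_FIELDS if record.get(field) is not None]
    if len(present) == len(EXPECTED_FIELDS):
        return "evidenced"
    return "partial" if present else "absent"


# Illustrative inputs only.
examples = {
    "full": {"input": "diff", "output": "approve", "policy": "p-7"},
    "some": {"input": "diff"},
    "hidden": {"redacted": True},
    "none": None,
}

statuses = {name: classify_fact(record) for name, record in examples.items()}
print(statuses)
```

Separating opaque from absent distinguishes "the system recorded something we cannot read" from "nothing was recorded", which lead to different remediation steps.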

Foundational operational-evidence artifacts:

Supporting policy-as-code project:

  • RuleHub - Policy-as-Code ecosystem for AI / ML guardrails, policy enforcement, and reproducible evidence. It is currently secondary to OEP and serves as a policy / agent-runtime bridge rather than the lead artifact.

Agent Runtime & Operational Evidence

Python · Go · OPA/Rego · MCP · OpenTelemetry · JSON Schema · gRPC · SQLite


Cloud & Platform Engineering

Kubernetes · Terraform · Helm · Argo CD · Istio · Envoy · AWS · GCP · Azure


Data, Streaming & Evidence Joins

Kafka · Flink · ClickHouse · Redis · PostgreSQL


Observability, Evals & Reliability

Prometheus · Thanos · Grafana · Loki · Tempo · Sentry · Evals · Replay


Policy, Identity & Safeguards

OAuth2/OIDC · JWT · Vault · Kyverno · Trivy · Semgrep · Checkov · CodeQL

Pinned

  1. rulehub/rulehub (Public)

    Policy-as-Code guardrails for ML and LLM systems: OPA/Kyverno policies, compliance mappings, signed bundles, evidence trails, and plugin index.

    Python · 5 stars · 1 fork

  2. costscope/costscope (Public)

    Open FinOps and governance data plane for FOCUS 1.2 cost normalization across cloud, on-prem, GPU, and AI/LLM workloads.

    Go · 1 star · 1 fork

  3. governance-evidence/decision-event-schema (Public)

    JSON Schema for decision events as governance evidence units in automated decision and real-time risk systems. MIT.

    Python · 1 star · 1 fork