
[Hackathon] chainaim-nisha : Outcome-Verified Settlement: per-tick payments gated by delivery, integrity, and conformance truly allowing for outcome versus tokens #61

Open
chainaim-nisha wants to merge 1 commit into projnanda:main from chainaim-nisha:hackathon/settlement-protocol-standards-eng-outcome-verified-settlement

@chainaim-nisha


Outcome-Verified Settlement: per-tick payments gated by delivery, integrity, and conformance truly allowing for outcome versus tokens

Problem: 03 — Streaming pay-per-second payments with mid-stream cancellation
Layer: payments
Registered as: ("payments", "outcome_verified_settlement")

Naming note: the spec's suggested key ("payments", "streaming") is already held in
_BUILTINS by merged PR #21, and the spec's scenario filename,
scenarios/streaming_payments.yaml, is likewise occupied by that PR's shipped
scenarios. Reusing either would have meant modifying that merged plugin (out of
scope under the charter's additive-only rule) or silently clobbering its
registration. Hence a distinct key for a distinct invariant: "billed ≤ rate ×
verified units" rather than "billed while the stream is open". This plugin
complements PR #21 rather than competing with it (the default gate reproduces its
delivery-gated billing byte-for-byte).

Persona: settlement-protocol-standards-eng
Branch: hackathon/settlement-protocol-standards-eng-outcome-verified-settlement

Visual explanation of the PR solution

https://incredible-baklava-420ffb.netlify.app/

Motivation

The spec's own frame for Problem 03 is time-metered billing: bill while the stream
is open, stop billing on close or partition. That's a real requirement and this PR
satisfies it. But time-metered billing only answers "did the clock tick" — it says
nothing about whether the metered unit was actually delivered, delivered intact, or
delivered correct. This PR adds a pluggable settlement gate so a tick can be
billed on any of three cumulative disciplines:

  • the default gate reproduces today's delivery-gated billing byte-for-byte (a tick
    bills once the seller acks it), so it's a drop-in, not a rewrite;
  • an opt-in integrity gate additionally requires the delivered bytes to match a
    declared checksum before a tick is billed, catching corruption in transit;
  • an opt-in conformance gate additionally requires the delivered content to match
    the buyer's committed acceptance criterion for that specific unit — catching a
    seller that delivers a different, real unit's content with an honest checksum
    of what it actually sent. Integrity alone cannot see this: the checksum is
    honest, the content is just wrong.

The name outcome_verified_settlement is now literally backed by code: an L3 pass
means delivered AND intact AND conforming, not merely "the seller's own checksum
claim was self-consistent."

The verification ladder

| Level | Gate | Verifies | Blind to | Status |
|---|---|---|---|---|
| L0 | none (tick-based) | clock advanced | everything | not shipped (the naive incumbent model this PR is not) |
| L1 | AckReceivedGate (default) | something arrived | acked corrupt/wrong content | shipped |
| L2 | ChecksumGate (gate: checksum) | the declared bytes arrived intact | honestly-checksummed but wrong-unit content | shipped |
| L3 | EvaluatorGate (gate: evaluator, criterion: reference_match) | delivered content matches THIS unit's committed reference, on top of L1+L2 | subjective "is this good," anything beyond the committed criterion | shipped |

Each rung catches exactly one failure class the rung below is blind to.
EvaluatorGate composes ChecksumGate internally (require_integrity=True
default) and short-circuits on an integrity failure before the criterion ever runs.
The proof that L3 is not redundant with L2 — an honestly-checksummed reply to the
wrong unit that passes integrity and fails conformance — is isolated in
test_outcome_verified_settlement_b5_checksum_passes_criterion_fails and re-proven
through the real driver in
test_outcome_verified_settlement_b6_nonconform_checksum_is_honest_at_failing_seq.
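The ladder can be illustrated with a minimal sketch. This is not the code in gates.py: the class names AckGate, IntegrityGate and ConformanceGate are hypothetical stand-ins for AckReceivedGate, ChecksumGate and EvaluatorGate, the Delivery shape is invented for illustration, and the checksum is assumed to be a SHA-256 hex digest (the PR text does not state the algorithm). It shows the wrong-unit case passing L2 and failing L3.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


def sha256_hex(data: bytes) -> str:
    # Assumption: the declared checksum is a SHA-256 hex digest of the chunk.
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class Delivery:
    """What the seller returned for one unit (hypothetical shape)."""
    seq: int
    chunk: Optional[bytes]            # None means the unit was never acked
    declared_checksum: Optional[str]


class AckGate:
    """L1 stand-in: passes as soon as something was acked."""

    def verdict(self, d: Delivery) -> bool:
        return d.chunk is not None


class IntegrityGate:
    """L2 stand-in: L1 plus the bytes match the declared checksum."""

    def verdict(self, d: Delivery) -> bool:
        if d.chunk is None:
            return False
        return sha256_hex(d.chunk) == d.declared_checksum


class ConformanceGate:
    """L3 stand-in: L2 plus the content equals this unit's committed reference."""

    def __init__(self, references: dict[int, bytes], require_integrity: bool = True):
        self.references = references
        self.lower = IntegrityGate() if require_integrity else AckGate()

    def verdict(self, d: Delivery) -> bool:
        # Short-circuit: the criterion never runs if the lower rung fails.
        if not self.lower.verdict(d):
            return False
        return d.chunk == self.references.get(d.seq)


references = {0: b"unit-0", 1: b"unit-1"}

# The seller answers seq 1 with unit 0's real content and an honest checksum of it.
wrong_unit = Delivery(seq=1, chunk=b"unit-0", declared_checksum=sha256_hex(b"unit-0"))
l2_wrong_unit = IntegrityGate().verdict(wrong_unit)              # integrity passes
l3_wrong_unit = ConformanceGate(references).verdict(wrong_unit)  # conformance fails

# Corruption in transit: the bytes no longer match the declared checksum.
corrupt = Delivery(seq=0, chunk=b"unit-X", declared_checksum=sha256_hex(b"unit-0"))
l1_corrupt = AckGate().verdict(corrupt)        # something arrived
l2_corrupt = IntegrityGate().verdict(corrupt)  # but it is not intact
```

In this sketch each rung rejects one delivery the rung below it accepts, which is the property the b5 and b6 tests are described as proving against the real gates.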

json_schema and artifact_match are two additional, real, independently
unit-tested criteria (test_outcome_verified_settlement_b7_criteria.py) —
deliberately not yet wired through Gate.from_name/scenario YAML, since they need
parameters (required_fields, expected_sha256, etc.) the wire-level config
doesn't carry in this iteration. reference_match is the only criterion reachable
from a scenario today; criterion_hash-on-wire is roadmap, not this PR.

What ships

| Spec/rubric criterion | Location |
|---|---|
| Plugin registered ("payments","outcome_verified_settlement") | nest_core/plugins.py _BUILTINS + pyproject.toml entry point nest.plugins.payments |
| open_stream(to, rate_per_tick, max_total, ref) -> StreamHandle | outcome_verified_settlement.py::OutcomeVerifiedSettlement.open_stream |
| close_stream(ref) -> Receipt, deterministic, any tick, unused remainder never spent | outcome_verified_settlement.py::OutcomeVerifiedSettlement.close_stream |
| Drain one tick at a time, capped at max_total | outcome_verified_settlement.py::OutcomeVerifiedSettlement.advance |
| pay/refund (existing Payments protocol) still work | pay == a one-tick stream that drains the full amount |
| Settlement gate seam (L1/L2/L3) | scenarios_builtin/chainaim/gates.py: Gate.from_name, AckReceivedGate, ChecksumGate, EvaluatorGate |
| Criterion library | gates.py: reference_match (wired), json_schema/artifact_match (unit-tested, not wired) |
| Scenario driver + trace grammar | scenarios_builtin/chainaim/outcome_verified_settlement.py |
| Adversarial validators (4; spec requires 2) | nest_core/chainaim/outcome_verified_settlement_validator.py |
| Scenarios (7): base + 5 controls across L1/L2/L3 + rolling streams | scenarios/outcome_verified_settlement{,_overbill,_degrade,_degrade_billbug,_nonconforming,_nonconforming_billbug,_rolling}.yaml |
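As a rough illustration of the open_stream / advance / close_stream lifecycle listed above, here is a toy in-memory ledger. Everything in it is an assumption made for illustration: ToyLedger and Stream are hypothetical names, the payer is passed explicitly (the real open_stream signature does not take one), the verified flag stands in for a gate verdict, and partial billing of the final tick at the cap is a guess at what "capped at max_total" means.

```python
from dataclasses import dataclass


@dataclass
class Stream:
    payer: str
    payee: str
    rate: int
    max_total: int
    drained: int = 0
    closed: bool = False


class ToyLedger:
    """Hypothetical in-memory ledger; not the plugin's real class."""

    def __init__(self, balances: dict[str, int]):
        self.balances = dict(balances)
        self.streams: dict[str, Stream] = {}

    def open_stream(self, payer: str, to: str, rate_per_tick: int,
                    max_total: int, ref: str) -> str:
        self.streams[ref] = Stream(payer, to, rate_per_tick, max_total)
        return ref

    def advance(self, ref: str, verified: bool) -> int:
        """Bill at most one tick and return the amount moved."""
        s = self.streams[ref]
        if s.closed or not verified:
            return 0  # nothing bills after close or on a failed verdict
        amount = min(s.rate, s.max_total - s.drained)  # never exceed the cap
        # Debit and credit happen in the same step, so value is conserved.
        self.balances[s.payer] -= amount
        self.balances[s.payee] += amount
        s.drained += amount
        return amount

    def close_stream(self, ref: str) -> int:
        """Close the stream and return the total drained (a stand-in receipt)."""
        s = self.streams[ref]
        s.closed = True
        return s.drained


ledger = ToyLedger({"buyer": 100, "seller": 0})
ref = ledger.open_stream("buyer", "seller", rate_per_tick=3, max_total=7, ref="s1")
moved = [ledger.advance(ref, verified=v) for v in (True, False, True, True, True)]
receipt_total = ledger.close_stream(ref)
after_close = ledger.advance(ref, verified=True)
```

With a rate of 3 and a cap of 7, the sketch moves 3, 0, 3, 1, 0 across the five ticks, and the unused remainder of the buyer's balance is never touched.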

Threat model

| # | Attack | Invariant that defeats it | Caught by |
|---|---|---|---|
| 1 | Drain-after-close: plugin keeps debiting after the stream closed | cumulative drain never exceeds max_total; no metered tick after the close tick; every debited unit is credited in the same step (conservation), enforced atomically in advance | validate_outcome_verified_settlement_no_drain_after_close; conservation is proven by a hypothesis property test (the trace carries no balances, so per-tick debit==credit is proved at the plugin boundary, not re-derived from the trace) |
| 2 | Over-bill on partition: payer is partitioned mid-stream; plugin keeps billing for ticks the payee never received | drained ≤ rate × acks received | validate_outcome_verified_settlement_no_overbill; the bill_on_send: true variant (_overbill.yaml) is the runnable negative control |
| 3 | Over-bill on failed verification: seller delivers corrupt (L2) or nonconforming (L3) content; plugin bills anyway | drained ≤ rate × pass-verdicts (content gate only) | validate_outcome_verified_settlement_no_overbill_on_failed_verification; _degrade_billbug.yaml (L2) and _nonconforming_billbug.yaml (L3) are the runnable negative controls |
| 4 (beyond spec) | Dishonest verdict: a gate implementation reports pass despite a checksum that doesn't actually match | a gate:pass requires the recomputed checksum to match the declared one, re-derived independently from the trace, never trusting the gate's own claim | validate_outcome_verified_settlement_verdicts_match_committed_criterion. Scope, stated not hidden: integrity-honesty only, one-directional. A legitimate L3 fail is never flagged, only a dishonest pass is, because the trace doesn't yet commit which criterion was configured (roadmap: criterion_hash on the wire) |

Every validator reconciles from the trace, never from the plugin's own accounting —
the threat modeled is precisely a plugin whose internal accounting is wrong.
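A trace-only check in the spirit of attack 4 might look like the sketch below. It is not the PR's validator: dishonest_passes is a hypothetical name, the ack and gate line shapes are taken from the trace grammar in this PR, and the checksum is assumed to be a SHA-256 hex digest over the decoded chunk bytes. Refs are assumed to contain no colon.

```python
import hashlib


def dishonest_passes(trace: list[str]) -> list[tuple[str, int]]:
    """Return (ref, seq) pairs where a gate reported pass but the
    checksum recomputed from the trace does not match the declared one."""
    intact: dict[tuple[str, int], bool] = {}
    flagged: list[tuple[str, int]] = []
    for line in trace:
        parts = line.split(":")
        if parts[0] == "ack" and len(parts) == 5:
            # ack:<ref>:<seq>:<chunk_hex>:<declared_checksum>
            _, ref, seq, chunk_hex, declared = parts
            recomputed = hashlib.sha256(bytes.fromhex(chunk_hex)).hexdigest()
            intact[(ref, int(seq))] = recomputed == declared
        elif parts[0] == "gate" and len(parts) == 4 and parts[3] == "pass":
            # gate:<ref>:<seq>:pass, checked without trusting the gate's claim
            key = (parts[1], int(parts[2]))
            if not intact.get(key, False):
                flagged.append(key)
    return flagged


checksum = hashlib.sha256(b"hello").hexdigest()
example_trace = [
    f"ack:s1:0:{b'hello'.hex()}:{checksum}",
    "gate:s1:0:pass",   # honest pass
    f"ack:s1:1:{b'hellx'.hex()}:{checksum}",
    "gate:s1:1:pass",   # dishonest pass: bytes do not match the checksum
    f"ack:s1:2:{b'hellx'.hex()}:{checksum}",
    "gate:s1:2:fail",   # honest fail, never flagged
]
flagged_units = dishonest_passes(example_trace)
```

The check is one-directional by construction: a fail verdict is never inspected, so a legitimate conformance failure on intact bytes cannot be flagged.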

Trace grammar

stream-open:<ref>:<payer>:<payee>:<rate>:<max_total>:<opened_tick>
tick:<ref>:<seq>:<rate>:<now_tick>                        buyer -> seller
ack:<ref>:<seq>                                           seller -> buyer (L1 delivery gate)
ack:<ref>:<seq>:<chunk_hex>:<declared_checksum>           seller -> buyer (L2/L3 content gate)
gate:<ref>:<seq>:pass|fail                                content gate only
stream-close:<ref>:<seq>:<drained>:<close_tick>:<reason>

Unchanged since L1+L2: L3 (nonconforming) units reuse this exact grammar — no new
trace lines were needed for conformance-gating; only the bytes the seller sends
differ, not the message shape. The default (L1) path still emits a trace
byte-identical to the pre-gate scenario.
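To show how a bound such as drained ≤ rate × verified units can be reconciled from these lines alone, here is a small sketch. The function name overbilled_refs is hypothetical, the field positions follow the grammar above, and treating a stream as content-gated whenever it carries a gate line or a five-field ack is an assumption of this sketch, not something the PR states.

```python
from collections import defaultdict


def overbilled_refs(trace: list[str]) -> list[str]:
    """Return refs whose closing drained amount exceeds rate x verified units.

    Verified units are pass verdicts on content-gated streams and plain
    acks on delivery-gated (L1) streams.
    """
    rate: dict[str, int] = {}
    acks: dict[str, int] = defaultdict(int)
    passes: dict[str, int] = defaultdict(int)
    gated: set[str] = set()
    violations: list[str] = []
    for line in trace:
        kind, ref, *rest = line.split(":")
        if kind == "stream-open":
            # rest = payer, payee, rate, max_total, opened_tick
            rate[ref] = int(rest[2])
        elif kind == "ack":
            acks[ref] += 1
            if len(rest) == 3:  # seq, chunk_hex, declared_checksum
                gated.add(ref)
        elif kind == "gate":
            gated.add(ref)
            if rest[1] == "pass":
                passes[ref] += 1
        elif kind == "stream-close":
            # rest = seq, drained, close_tick, reason
            drained = int(rest[1])
            units = passes[ref] if ref in gated else acks[ref]
            if drained > rate[ref] * units:
                violations.append(ref)
    return violations


l1_ok = [
    "stream-open:s1:buyer:seller:2:10:0",
    "tick:s1:0:2:1",
    "ack:s1:0",
    "tick:s1:1:2:2",            # never acked (for example, a partition)
    "stream-close:s1:1:2:3:partition",
]
l3_bug = [
    "stream-open:s2:buyer:seller:5:50:0",
    "ack:s2:0:00:ab",
    "gate:s2:0:pass",
    "ack:s2:1:00:ab",
    "gate:s2:1:fail",
    "stream-close:s2:1:10:4:gate-fail",  # billed 2 units, only 1 passed
]
ok_result = overbilled_refs(l1_ok)
bug_result = overbilled_refs(l3_bug)
```

The sketch reads only trace lines and never consults plugin state, which mirrors the stated discipline that validators reconcile from the trace. The reason strings and checksum values in the example traces are placeholders.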

Design tradeoff: no custody, unit-capped credit risk instead

This PR moves funds buyer→seller directly per verified tick — no escrow account, no
locked capital, no arbiter. The tradeoff: the seller carries per-unit credit risk
(the buyer could be insolvent at the moment of advance()), which an escrow model
would eliminate via a solvency guarantee. This PR's position: that risk is detected
on the very next tick and capped at one unit's rate, which is a better trade than
locking working capital for the full stream duration — for small, frequently-verified
units. For large, one-shot, subjective transfers, escrow is the right tool; this PR
doesn't compete there.

How this PR differs from other PRs and addresses the industry need to pay for outcomes rather than tokens (hence not a duplicate)

The spec's suggested key, ("payments","streaming"), is already registered by a
merged upstream PR (#21) that bills on the clock while a stream is open. This PR
registers a distinct key, outcome_verified_settlement, because the contribution is
a different invariant: "billed ≤ rate × verified units" rather than "billed while
the stream is open". The default gate reproduces the merged plugin's
delivery-gated behavior byte-for-byte, so this is a drop-in verification upgrade,
not a competing reimplementation. The open escrow submissions (#7 HTLC, #38
arbitrated escrow, #41 EMPIC evidence-gated escrow) occupy the custody/arbitration
design space: funds locked up front, released on acceptance or arbitration. This
PR deliberately occupies the opposite corner: no custody, direct per-unit
settlement, each unit verified before it bills, which makes it more practical for
agentic delivery-versus-payment settlement (see the design-tradeoff section
for when each model is the right tool).

| | PR #21 ("payments","streaming") | This PR ("payments","outcome_verified_settlement") |
|---|---|---|
| Invariant | billed while the stream is open (clock/delivery-metered) | billed ≤ rate × verified units (outcome-metered) |
| Gate seam | none; settlement is implicit in the stream lifecycle | pluggable Gate seam: L1 delivery / L2 integrity / L3 conformance |
| Validators | upstream streaming lifecycle validators | 4 adversarial trace-only validators, incl. verified-prefix billing and verdict honesty |
| Relationship | baseline, correct by its own spec (see ..._b11_invariant_not_vacuous.py) | complements, not competes: the default gate reproduces #21's delivery-gated billing byte-for-byte |

Verification

uv sync
uv run ruff check .
uv run ruff format --check .
uv run pyright
uv run pytest -v   # full workspace, no path filter

Observed on the exact branch under review (full workspace, no path filter):
839 passed, 1 skipped, 1 deselected, 0 failed; ruff check clean ("All checks
passed!"); ruff format --check clean (187 files already formatted); pyright
0 errors / 0 warnings / 0 informations; uv sync resolved 89 packages clean.

# L1/L2 (unchanged from prior sessions)
uv run nest run scenarios/outcome_verified_settlement.yaml --seed 42
uv run nest run scenarios/outcome_verified_settlement_overbill.yaml --seed 42          # no_overbill FAILS
uv run nest run scenarios/outcome_verified_settlement_degrade.yaml --seed 42
uv run nest run scenarios/outcome_verified_settlement_degrade_billbug.yaml --seed 42   # no_overbill_on_failed_verification FAILS

# L3 (new this iteration)
uv run nest run scenarios/outcome_verified_settlement_nonconforming.yaml --seed 42
uv run nest run scenarios/outcome_verified_settlement_nonconforming_billbug.yaml --seed 42  # no_overbill_on_failed_verification FAILS

# Rolling streams (streams_per_buyer: 3 -- each buyer rolls 3 consecutive L3-gated streams)
uv run nest run scenarios/outcome_verified_settlement_rolling.yaml --seed 42

Rolling streams are covered end-to-end in
test_outcome_verified_settlement_b10_rolling.py (unique per-cycle refs, 4/4
validators on the rolling trace, per-cycle cap independence, failed-verdict-closes-
cycle-next-still-opens, a partitioned buyer that opens once, never rolls, and
bills nothing, same-seed determinism, and a regression that the base scenario
stays on the five legacy refs). The comparative discipline proof lives in
test_outcome_verified_settlement_b11_invariant_not_vacuous.py: the identical
delivered-unit sequence passes all four validators under outcome-verified billing
and fails exactly no_overbill_on_failed_verification under clock/delivery
billing — the baseline is correct by its own spec (it satisfies no_overbill
exactly), so the new invariant is discriminating, not vacuous.

Extension in phase 2

In phase 2, the solution will be extended with more real-world scenarios and use cases.


@shilpa-kulkarni-14


Thanks for this, the engineering is strong. Verified locally on the PR head
(8fbfec0, already on current main):

  • ci-local 5/5 green (ruff, format, pyright 0 errors, pytest 839 passed); chainaim
    bank 103 passed.
  • Discriminator confirmed and honest (b11): on the identical unit sequence with an
    identical stream-close, the clock-billed baseline (streaming, PR #21) fails ONLY
    the new no_overbill_on_failed_verification invariant while outcome-verified passes.
  • Determinism across seeds 42/7/1337: byte-identical on repeat, distinct per seed.

Holding merge on scope, not correctness. This adds a chainaim/ vendor namespace
across nest-core, the reference package, tests, and scripts (24 of 37 files), and
puts validator logic in a vendor module rather than inline in validators.py like the
other layers. Merging as-is sets a precedent for company-namespaced subtrees.
