
[Hackathon] srcJin: Problem #10 — ResonanceBFT, pentadic partition-tolerant BFT consensus for social agent networks#58

Open
srcJin wants to merge 1 commit into projnanda:main from srcJin:hackathon/srcJin-resonance-bft

@srcJin srcJin commented Jul 2, 2026


ResonanceBFT — Pentadic BFT Consensus for Social Agent Networks

Layer: coordination
Plugin name: resonance_bft
Problem: #10 — Partition-tolerant BFT consensus
Persona: cognitive-social-computing researcher — someone who came to distributed consensus from multi-agent social simulation, not from distributed databases. The question "did we agree?" is not a bit-level question when the agents are social and goal-directed; it is simultaneously semantic, affective, relational, epistemic, and behavioral.

How to read this. The document goes shallow-to-deep: Overview (the shape and the one load-bearing invariant) → The commit certificate (the L1 safety core) → Deliberation, authenticity & robustness (the L2/L3 mechanisms) → Evidence & verification → Appendix (fine-grained detail, referenced from the body). A new reader can stop after the first two parts and still have the whole thesis.

Contents — outline & navigation
  1. Overview
    TL;DR · Claims → evidence · Motivation · Architecture · How consensus is reached
  2. The commit certificate — L1 safety
    Quorum = n − f · Weighted pentadic quorum · Sealed commitments · Resolver-independent commit + box-validity
  3. Deliberation, authenticity & robustness — L2 / L3
    Deliberation (Hegselmann-Krause) · Trajectory classification · Time-decayed dyadic trust · Byzantine centroid dampening · Cold-start · Sybil guard & view-change · Adaptation, observability & replay
  4. Tradeoffs, evidence & verification
    Tradeoffs · End-to-end through the town runner · Real-LLM town evidence · Verification · Safety proof sketch
  5. Appendix
    A · Package structure · B · Observability report · C · snapshot / restore · D · Three-timescale weights · E · Edge cases · F · Invariants · G · Related systems · H · Academic context

Overview

TL;DR

ResonanceBFT is a partition-tolerant BFT consensus plugin for NANDA Town (problem #10). Agents agree not on a single value but on a five-axis belief vector (semantic · affective · relational · epistemic · behavioral). A deterministic n − f commit core (L1) owns safety; a social-science audit (L2) and three-timescale adaptation (L3) sit on top but never alter the commit. LLM nondeterminism is sealed at participate(), so the commit is a resolver-independent function of the sealed vectors.

Novel: agreement over a value space with box-validity MBAA (deliberately cheaper than convex validity — n ≥ 3f+1 vs 7f+1), cryptographic tamper-evidence + transferable equivocation certificates, and a genuine-vs-superficial consensus audit — classic BFT safety plus social interpretation, with L1 unweakened.

Verified: ruff clean · pyright 0 errors · 321 plugin tests (unit + Hypothesis property + e2e; 1057 whole-repo) · plus real-LLM town evidence — 140 real ScenarioRunner runs across mock / claude-haiku-4-5 / gpt-5.5 / Gemini 3.5 Flash, scaling to n = 49 (commit quorum = n − f), the deterministic core model-agnostic.

Claims → evidence

| Claim | Verified by |
| --- | --- |
| Safety: any two n−f quorums intersect in an honest node | test_bft_quorum_intersection_safety (Hypothesis, n∈[4,12]) |
| The commit is a resolver-independent function of the sealed vectors | test_commit_is_resolver_independent_under_divergent_trust |
| A Byzantine extreme cannot pull the commit (box validity) | test_byzantine_extreme_cannot_push_aggregate_out_of_honest_box · test_resists_biased_minority_better_than_plain_mean |
| No split-brain: a partitioned minority cannot commit | TestPartitionQuorumSafety · test_metadata_cannot_lower_configured_membership |
| Tampered records are detected and excluded | seal/signature validator tests + byzantine e2e (tampered=2) |
| Deliberation converges (opinion diameter contracts) | test_deliberation_contracts_opinion_diameter |
| L2 / L3 never alter the L1 commit certificate | commit uses fixed weights; test_learned_axis_weights_are_load_bearing_in_deliberation (L2 only) |
| The real town commits with real models, at scale | tests/test_resonance_bft_e2e.py · EVIDENCE.md / EVIDENCE_LARGE.md (n=4→49, 3 models) |

Motivation

Classical BFT (PBFT, HotStuff, Tendermint) defines consensus as identical byte sequences. This is the right definition for a replicated ledger. It is the wrong definition for a social agent network where two agents might submit the same value for incompatible reasons, or where an agent that "agrees" but feels anxious about it will defect under the next adversarial pressure.

contract_net — the default coordination plugin — reduces agreement to "who bids lowest." It has no quorum logic, no view change, and no mechanism to detect whether agents are genuinely aligned or just strategically mimicking the leader's position (sycophancy).

ResonanceBFT upgrades the question. It requires simultaneous convergence across five orthogonal axes:

| Axis | Captures | Vector dim | Reference |
| --- | --- | --- | --- |
| Semantic | What agents believe (BoW TF embedding) | 64 | distributed consensus literature |
| Affective | How agents feel (valence × arousal) | 2 | Russell 1980, circumplex model |
| Relational | Who trusts whom (time-decayed dyadic trust) | n | sociometric literature |
| Epistemic | How certain / how stable (confidence × position_stability) | 2 | Bayesian epistemology |
| Behavioral | Did they keep their word (integrity × engagement) | 2 | mechanism design |

BFT safety guarantees still hold: n ≥ 3f+1, quorum_needed = n−f. What changes is what counts as a vote: a sealed commitment over the text-derived belief axes, with a weighted pentadic similarity for quorum classification that gives each axis its own fair weight.

Architecture — how the pieces fit

ResonanceBFT architecture: L0 perceive to a sealed 5-axis vector, into the deterministic L1 n-f commit (the only safety certificate), with L2 audit and L3 adapt feeding back only trust

What is novel: value-space consensus and cryptographic integrity (safety-critical) plus a social-interpretation layer that never weakens the L1 commit

The feature list below is not a flat pile; it is four layers over one representation, plus one load-bearing invariant. Classical BFT conflates who agrees (safety) with whether the agreement is genuine (quality — sycophancy, coercion, alliances all fake "agreement"). ResonanceBFT separates them and adds a learning loop:

 L0 REPRESENTATION   five axes: what / feel / trust / sure / integrity   (_vectors.py)
        │            the single state space everything runs on
   ┌────┴───────────────────────────────┐
   ▼                                     ▼
 L1 SAFETY (sacred)                   L2 AUTHENTICITY (the contribution)
   quorum = n − f       ── never ──►  genuine / pressured / alliance / failed
   weighted pentadic       altered    (CONSENSUS_FAMILIES) + sycophancy/evidence_delta
   tamper-evidence seal    by L2
        │ commit certificate                 │ quality label
        └──────────────┬──────────────────────┘
                       ▼
 L3 ADAPTATION (3 timescales, _trust.py): learn the *instruments* from the quality
   labels — round trust/ε · epoch threshold/base-ε · slow axis-weights (EG). Shapes L2 and is
   surfaced as observability; the L1 commit uses FIXED weights by design (resolver-independence).
   Long-horizon by design: dormant in the 5-round graded scenarios; invariants property-tested.

 cross-cutting ROBUSTNESS: Byzantine centroid dampening · cold-start grace · Sybil guard

Load-bearing invariant: L2 (authenticity) and L3 (adaptation) never alter the L1 commit certificate (n−f). That is what lets the protocol add social-science interpretation and self-tuning without weakening any BFT safety guarantee. Every concept below maps to exactly one layer: five axes → L0; Byzantine dampening / cold-start / Sybil → robustness; sycophancy / evidence_delta / trajectory types → L2; three-timescale learning → L3; co-commit ledger → the memory L2 and L3 share. The module docstring (__init__.py) and the seams (resolve = L1, deliberate = L2, record_round_outcome = L3) state their layer in code.

How consensus is reached — one agent, many agents, and the space

Three complementary views of the same convergence: one agent over time → many agents over time → many agents in the embedding space. This is the intuition for everything that follows; the mechanics are in the next two parts.

One agent over twelve rounds through cold-start, converging and consensus phases: opinion converging and trust rising, a slow learned weight warming up in its own strip, and a signed evidence-delta strip

Five honest agents' opinions spread then contract into a shaded consensus funnel to a bullseye over rounds, while a Byzantine agent stays outside and is excluded

Consensus in embedding space: four coloured agents' trajectories converge with velocity arrows into a domain bubble whose bullseye is the commit; a tampered trajectory diverges and is excluded


The commit certificate — L1 safety

This is the sacred core: how a quorum forms, how a vote is sealed, and why every honest node commits identically. Nothing here depends on trust, reputation, or learning.

Quorum formula: quorum_needed = n − f (not 2f+1)

Any two n-minus-f quorums over seven nodes overlap in f+1 nodes, so at least one honest node is shared — no split-brain

The standard formula 2f+1 is only correct when n = 3f+1 exactly.
For n > 3f+1, two quorums of size 2f+1 can intersect in as few as f agents (the bound is 2(2f+1) − n ≤ f), and all of them may be Byzantine.

ResonanceBFT uses quorum_needed = n − f, which gives the correct intersection for all n ≥ 3f+1:

|Q1 ∩ Q2| ≥ (n−f) + (n−f) − n  =  n − 2f  >  f   iff  n > 3f  iff  n ≥ 3f+1  ✓

Verified by the test_bft_quorum_intersection_safety Hypothesis property test.
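The intersection bound above can be checked by brute force on a small cluster. The sketch below is illustrative only: `min_quorum_overlap` and `max_faults` are hypothetical helper names, not functions from the plugin, and n = 8 is chosen simply because it is the smallest size with f = 2 and n > 3f+1.

```python
from itertools import combinations


def max_faults(n: int) -> int:
    """Largest f with n >= 3f + 1."""
    return (n - 1) // 3


def min_quorum_overlap(n: int, quorum_size: int) -> int:
    """Smallest intersection over every pair of quorums of the given size."""
    nodes = range(n)
    return min(
        len(set(q1) & set(q2))
        for q1 in combinations(nodes, quorum_size)
        for q2 in combinations(nodes, quorum_size)
    )


n = 8                                            # n > 3f + 1 with f = 2
f = max_faults(n)
overlap_2f1 = min_quorum_overlap(n, 2 * f + 1)   # quorums of size 5
overlap_nf = min_quorum_overlap(n, n - f)        # quorums of size 6

print(f"f={f}  2f+1 overlap={overlap_2f1}  n-f overlap={overlap_nf}")
# f=2  2f+1 overlap=2  n-f overlap=4
```

With quorum size 2f+1 the worst-case overlap equals f, so it can consist entirely of Byzantine nodes; with n − f it is strictly larger than f, so at least one honest node is always shared.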

Weighted pentadic quorum (anti-dimensionality-domination)

One opinion vectorised into five independent belief axes — an irregular radar, the multi-dimensionality a single scalar vote would collapse

Two agents' five-axis vectors compared per-axis by cosine, then combined by the fixed axis weights into one alignment score versus the 0.60 commit threshold

Naively concatenating all five axis vectors into one combined vector and taking the full-vector cosine gives the semantic axis (64 dims) ≈91% effective weight, rendering the other four axes nearly irrelevant.

ResonanceBFT computes, per axis, each agent's cosine similarity to that axis's centroid, then aggregates with explicit per-axis weights. At commit the centroid is a coordinate-wise trimmed mean over non-tampered records, with no trust or reputation weighting, which keeps it resolver-independent (see the resolver-independence section); the explicit _AXIS_WEIGHTS are what give each axis its fair share despite the 64-vs-2 dimensionality gap:

_AXIS_WEIGHTS = {
    "semantic":   0.25,
    "affective":  0.20,
    "relational": 0.25,
    "epistemic":  0.15,
    "behavioral": 0.15,
}

pentadic_sim[aid] = sum(_AXIS_WEIGHTS[ax] * cosine(rec[ax], axis_centroid[ax])
                        for ax in _AXIS_WEIGHTS)

During deliberate() (L2, not the commit), each axis centroid is additionally weighted by axis-specific trust (_axis_trust[me][other][axis]), so an agent trusted for semantic alignment pulls the semantic centroid more than one trusted only behaviorally. The L1 commit centroid deliberately drops this weighting to stay resolver-independent.
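To make the dimensionality-domination point concrete, here is a minimal sketch comparing the full-vector cosine with the weighted per-axis score. The helper names (`cosine`, `pentadic_sim`, `concat_cosine`), the toy vectors, and the 4-dimensional relational axis are assumptions for illustration; only the weights and the 0.60 threshold come from the description above.

```python
import math

AXIS_WEIGHTS = {"semantic": 0.25, "affective": 0.20, "relational": 0.25,
                "epistemic": 0.15, "behavioral": 0.15}


def cosine(a, b):
    """Cosine similarity; 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def pentadic_sim(rec, centroid):
    """Weighted sum of per-axis cosines."""
    return sum(w * cosine(rec[ax], centroid[ax]) for ax, w in AXIS_WEIGHTS.items())


def concat_cosine(rec, centroid):
    """Naive alternative: one cosine over the concatenated vector."""
    flat_rec = [x for ax in AXIS_WEIGHTS for x in rec[ax]]
    flat_cen = [x for ax in AXIS_WEIGHTS for x in centroid[ax]]
    return cosine(flat_rec, flat_cen)


centroid = {"semantic": [1.0] * 64, "affective": [1.0, 1.0],
            "relational": [1.0] * 4, "epistemic": [1.0, 1.0],
            "behavioral": [1.0, 1.0]}
# Agrees on the semantic axis only; opposite on the other four axes.
dissenter = {ax: (list(vec) if ax == "semantic" else [-v for v in vec])
             for ax, vec in centroid.items()}

naive = concat_cosine(dissenter, centroid)    # about 0.73, dominated by 64 dims
weighted = pentadic_sim(dissenter, centroid)  # about -0.50
print(round(naive, 2), round(weighted, 2))
```

In this toy case the concatenated cosine would clear a 0.60 threshold even though the agent disagrees on four of five axes, while the weighted per-axis score rejects it.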

Sealed commitments (tamper-evidence for the belief axes)

Five belief axes plus a nonce hashed with SHA-256 and ed25519-signed; mutating a sealed value makes the digest mismatch, so the record is flagged tampered and excluded

Two-layer integrity (resolve() checks both; either failing ⇒ tampered):

  1. SHA-256 seal sha256(belief ‖ nonce) over all five sealed belief axes — cheap tamper-evidence that catches any party editing a sealed belief without recomputing the seal. For the bag-of-words semantic axis the seal also binds the vocabulary basis (sha256(belief ‖ nonce ‖ vocab)): since resolve() reconciles semantics over each record's vocab, relabelling that coordinate basis would reinterpret the sealed semantic values without touching them — folding the basis into the commitment makes that relabelling break the seal → flagged tampered (test_vocab_tampering_is_detected).
  2. ed25519 signature over (eval_text ‖ commitment) — a real cryptographic authorship binding that a metadata-controlling adversary cannot forge without the agent's private key. Each ResonanceBFT derives a deterministic signing key from its seed (RFC 8032 signatures are deterministic, so reproducibility holds), signs at participate(), and resolve() verifies. Binding the commitment (not just the text) means a swapped vector — which forces a new commitment to pass the seal — also breaks the signature, closing the vector-swap attack. The signer re-signs after a legitimate vocab-extension re-embed (which changes the commitment), so honest agents stay valid.

Honest scope. Minting the public key ↔ agent identity binding (the PKI step) is delegated to the stack's identity layer (ed25519_rotating) — by the layer separation, a coordination plugin should consume that binding, not reimplement the PKI. And it does: when the identity layer supplies round.metadata["identity_pubkeys"], resolve() rejects any record whose pubkey doesn't match the bound key for its aid (so a metadata adversary cannot substitute a self-signed record). By default a record for an agent with no bound key is accepted (the layered default); a security-conscious deployment can set round.metadata["require_identity_binding"] = True for fail-closed mode, where any record lacking an identity binding is rejected. Absent the map entirely, the plugin still provides cryptographic authorship + tamper-evidence, and full anti-equivocation is identity-layer + this signature together. We state this rather than overclaim.

The commitment seals all five belief axes the commit weights — semantic + affective + epistemic + behavioral + relational_sealed (the participate-time relational, never the deliberate()-mutated copy):

belief_vec = (semantic + affective + epistemic + behavioral + relational_sealed)
commit = sha256(f"{belief_vec}:{nonce}".encode()).hexdigest()

This is deliberate: relational/epistemic/behavioral carry 0.55 of the pentadic quorum weight, so sealing only semantic+affective would let a metadata-controlling adversary rewrite those three axes to inflate or evict a quorum member with no tamper flag. Sealing all five closes that hole.

Why not seal the full combined vector? deliberate() is legitimately allowed to update the combined position (HK bounded-confidence updating). Sealing combined would flag every deliberating agent as a tamperer — and deliberate() specifically overwrites rec["relational"] from its own trust view, which is exactly why the seal (and the commit) read the immutable relational_sealed copy instead. The sealed axes are where sycophancy actually occurs: an agent changing what it claims to believe or how certain it claims to be after seeing the leader's position.

The tamper check recomputes the seal over all five axes (a malformed/incomplete record raises on reconstruction and is treated as tampered, not a crash):

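from hashlib import sha256

# Recompute the seal the same way the commitment above was built, then compare.
expected = sha256(f"{_sealed_belief(rec)}:{rec['nonce']}".encode()).hexdigest()
if expected != rec["commitment"] or not valid_signature(rec):
    tampered.append(aid)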

validate_bft_no_equivocation re-derives the same five-axis seal during post-run validation.
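The seal half of the two-layer check can be sketched with the standard library alone. This is a simplified illustration, not the plugin's code: `SEALED_AXES`, `sealed_belief`, `seal`, `is_tampered`, and `make_record` are hypothetical names, the ed25519 signature layer and the vocabulary binding are omitted, and the record layout is assumed from the description above.

```python
import hashlib
import secrets

SEALED_AXES = ("semantic", "affective", "epistemic", "behavioral", "relational_sealed")


def sealed_belief(rec):
    """Concatenate the five sealed axes; raises KeyError on an incomplete record."""
    belief = []
    for ax in SEALED_AXES:
        belief.extend(rec[ax])
    return belief


def seal(rec, nonce):
    return hashlib.sha256(f"{sealed_belief(rec)}:{nonce}".encode()).hexdigest()


def is_tampered(rec):
    """Seal mismatch or a malformed record both count as tampered."""
    try:
        return seal(rec, rec["nonce"]) != rec["commitment"]
    except (KeyError, TypeError):
        return True


def make_record(axes):
    rec = {ax: list(vals) for ax, vals in axes.items()}
    rec["nonce"] = secrets.token_hex(16)
    rec["commitment"] = seal(rec, rec["nonce"])
    rec["relational"] = list(rec["relational_sealed"])  # mutable working copy
    return rec


honest = make_record({"semantic": [0.2, 0.8], "affective": [0.5, 0.1],
                      "epistemic": [0.7, 0.9], "behavioral": [1.0, 0.6],
                      "relational_sealed": [0.5, 0.5]})

deliberated = dict(honest, relational=[0.9, 0.1])   # deliberate()-style update
forged = dict(honest, epistemic=[1.0, 1.0])         # sealed axis rewritten
incomplete = {k: v for k, v in honest.items() if k != "behavioral"}

print(is_tampered(honest), is_tampered(deliberated),
      is_tampered(forged), is_tampered(incomplete))
# False False True True
```

Updating the mutable `relational` copy leaves the seal intact, while rewriting any sealed axis or dropping one flags the record.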

Equivocation → a transferable conflict certificate (the clean slice of accountability). Detection alone only tells you an agent lied. When an agent signs two distinct commitments for the same (round_id, aid) — the classic "tell A one thing, B another" — the validator now also emits an equivocation conflict certificate: a self-contained bundle of the two conflicting signed records. Because each is an ed25519 signature over (eval_text ‖ commitment ‖ round_id ‖ aid), anyone — who was never present and trusts no one — can verify it with only the agent's public key (verify_equivocation_certificate), so "we suspect X" becomes "here is X's own signature convicting it, twice." This is deliberately the one Byzantine fault with clean, provable attribution; we implement it (following Sheng et al. 2021, BFT Protocol Forensics, arXiv:2010.06785) rather than pay for full Polygraph-style O(n²) accountability the small-cluster setting does not warrant. Statistical faults (a single consistent-but-dishonest belief) remain the domain of the outlier / trust / social-audit layers — we do not overclaim a cryptographic proof where none exists.

BFT-agreement: the commit certificate is resolver-independent

The coordinate-wise trimmed mean stays inside the honest per-coordinate box despite a Byzantine extreme, while the plain mean is dragged out — box validity at n greater or equal 3f+1

Bar chart: two colluders drag the plain mean to cosine 0.55 below the 0.60 threshold, but the trimmed mean stays aligned at 0.99

A subtle BFT trap: if resolve() derives the quorum from anything in the resolver's private state, two honest nodes with divergent histories could disagree, which violates BFT agreement. The commit must be a pure function of shared replicated state. ResonanceBFT closes this on four paths. Every input to the quorum gate is either shared round metadata or a fixed protocol constant; nothing resolver-local enters it:

  1. Threshold + axis weights → fixed protocol constants, NOT the adaptive L2/L3 values. resolve() gates the quorum with self._commit_threshold (the configured threshold, identical cluster-wide) and scores with the fixed seed _AXIS_WEIGHTS; it never reads the learned self._store.threshold (L2) or self._store.axis_weights (L3), which are resolver-local and diverge as each node learns. The adaptive values still shape deliberation and are surfaced as observability (adaptive_threshold / adaptive_axis_weights in metadata), but they are structurally barred from the commit. This makes the headline invariant, "L2/L3 never alter the L1 commit certificate", literally true in code rather than a slogan.
  2. Centroid → a Byzantine-robust coordinate-wise TRIMMED mean of the sealed axes over the non-tampered records. No reputation, no trust, nothing resolver-local enters the centroid:
    honest  = {aid: rec for aid, rec in evaluations.items() if aid not in tampered}
    ax_vecs = [sealed_axis(rec, ax) for rec in honest.values()]
    trim    = f                                        # configured fault bound ⌊(n−1)/3⌋, per axis
    centroid[ax] = trimmed_mean(ax_vecs, trim)        # drop f extremes/side → box-valid at the n−f floor
    An earlier version weighted the centroid by global reputation; even though reputation is replicated SMR state, a lagging or partitioned node with a stale map could compute a different centroid — so we removed all resolver-local weighting. The remaining question is robustness to a valid (correctly sealed and signed, hence not flagged tampered) but biased minority: a plain mean has breakdown point 0 — even one extreme honest-looking vector shifts it. We therefore use the coordinate-wise trimmed mean (Yin et al., ICML 2018, arXiv:1803.01498): per axis, drop the configured fault bound f = ⌊(n−1)/3⌋ of largest and smallest values, then average. This raises the breakdown point toward the honest majority while staying deterministic — it sorts per-axis values, never agents, so the centroid is still a pure function of the sealed vectors and provably identical across honest resolvers. Trimming by f (not ⌊(k−1)/3⌋ of the arrived records) is what makes box validity hold at the n−f quorum floor: up to f of the k committed records can be Byzantine, and box validity requires trim ≥ (Byzantine present), so ⌊(k−1)/3⌋ would under-trim exactly there (n = 7, f = 2, k = 5: ⌊(k−1)/3⌋ = 1 < 2 leaves one biased extreme in the box). _trimmed_mean caps the trim at ⌊(k−1)/2⌋ so ≥1 value always survives (at the floor k = 2f+1 this leaves the coordinate median — still box-valid). For honest, well-aligned rounds the dropped values sit right next to the mean, so the trimmed and plain centroids are near-identical and the committed quorum is unchanged in practice (the suite's honest-round tests still pass); the trim only bites when a value is genuinely extreme — the biased-minority case (test_resists_biased_minority_better_than_plain_mean), and the quorum-floor case (test_box_validity_at_quorum_floor_trims_by_configured_f: two valid-but-biased records at n = 7, k = 5 cannot move the committed centroid — a regression that fails under the old ⌊(k−1)/3⌋ trim). 
Excluding tampered records and trimming biased extremes together mean neither a forged vector nor a valid-but-skewed one can drag the consensus point.
  3. Axis vectors → each agent's own sealed values, not recomputed from the resolver's private state. resolve() does not rebuild relational/epistemic/behavioral from self._trust_matrix / self._past_semantics / self._get_behavior (resolver-local). It reads each agent's sealed axes from the shared metadata — relational specifically from the immutable relational_sealed, never the deliberate()-mutated rec["relational"] (that mutation was the live bug behind the earlier resolver-agreement gap).
  4. Membership size is a non-lowerable floor, not whatever the metadata claims. quorum_needed = n − f is only safe if n cannot be shrunk by an adversary who controls the shared round metadata. resolve() computes n = max(present, metadata["expected_n"], self._expected_n, len(roster)) — the max over every source the resolver knows locally, including its own constructor expected_n. So a Byzantine proposer can only ever raise the bar (a genuinely larger cluster), never lower it: a stripped or under-stated expected_n in the metadata cannot make a 4-of-7 partition believe it is the whole cluster and commit alone. Verified by test_metadata_cannot_lower_configured_membership (metadata claims n=4, roster stripped, only 4 of 7 present → the resolver configured with expected_n=7 still requires quorum 5 and aborts, no split-brain) and test_uses_constructor_expected_n_when_metadata_missing.
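The membership floor in item 4 reduces to a max over every size the resolver knows. A minimal sketch, assuming hypothetical function names (`membership_floor`, `quorum_needed`) and treating a missing value as 0:

```python
def membership_floor(present, metadata_expected_n, configured_expected_n, roster_size):
    """n is the max over every source, so metadata can only raise it."""
    return max(present, metadata_expected_n or 0, configured_expected_n or 0, roster_size)


def quorum_needed(n):
    f = (n - 1) // 3          # fault bound for n >= 3f + 1
    return n - f


# A 4-of-7 partition: metadata understates n as 4 and the roster is stripped,
# but the resolver was constructed with expected_n = 7.
present = 4
n = membership_floor(present, metadata_expected_n=4,
                     configured_expected_n=7, roster_size=0)
needed = quorum_needed(n)
can_commit = present >= needed
print(n, needed, can_commit)   # 7 5 False
```

Because the local configuration participates in the max, the understated metadata cannot shrink the quorum, and the partitioned minority aborts rather than committing alone.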

So every honest resolver derives the identical per-agent similarity, applies the identical fixed threshold, and reaches the identical quorum. Verified by test_commit_is_resolver_independent_under_divergent_trust, which diverges all three resolver-local inputs — private per-dyad trust, the adaptive L2/L3 threshold + axis-weights, and reputation — between two participant resolvers, runs deliberate() on each with its own trust first, then asserts byte-identical similarities, quorum_agents, status, and reported commit params. (Three earlier iterations silently masked the dependence: external empty-state resolvers; then diverging only trust while the commit still read adaptive weights; then a commit that still read reputation — all three fixed.)
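The quorum-floor argument for trimming by the configured f (item 2 above) can be replayed on toy numbers. This sketch is an assumption-laden illustration: `trimmed_mean` and `in_honest_box` are stand-ins written for this example, not the plugin's `_trimmed_mean`, and the two-dimensional vectors are invented.

```python
def trimmed_mean(vectors, trim):
    """Coordinate-wise mean after dropping `trim` smallest and largest values."""
    k = len(vectors)
    t = min(trim, (k - 1) // 2)            # always keep at least one value
    out = []
    for d in range(len(vectors[0])):
        col = sorted(v[d] for v in vectors)
        kept = col[t:k - t]
        out.append(sum(kept) / len(kept))
    return out


def in_honest_box(point, honest):
    """True if every coordinate lies within the honest per-coordinate range."""
    return all(
        min(h[d] for h in honest) <= point[d] <= max(h[d] for h in honest)
        for d in range(len(point))
    )


# n = 7, f = 2, k = 5 records arrived: three honest, two valid-but-biased.
honest = [[0.4, 0.5], [0.5, 0.5], [0.6, 0.6]]
biased = [[10.0, -10.0], [10.0, -10.0]]
records = honest + biased

by_f = trimmed_mean(records, trim=2)       # trim by configured f
by_k = trimmed_mean(records, trim=1)       # trim by floor((k - 1) / 3)

print(by_f, in_honest_box(by_f, honest))   # [0.6, 0.5] True
print(in_honest_box(by_k, honest))         # False
```

Trimming by f leaves the coordinate median at this floor, which stays inside the honest box; trimming by ⌊(k−1)/3⌋ = 1 leaves one biased extreme per coordinate and the aggregate escapes the box.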

This is approximate agreement over a value space — and that is the correct model, stated precisely. ResonanceBFT agrees over a continuous five-axis value vector, not a discrete bit. The rigorous notion for consensus over vector values is Multidimensional Approximate Byzantine Agreement (MBAA) — honest outputs within ε of each other and inside the range of honest inputs (validity) — not exact agreement (Dolev et al., JACM 1986; Vaidya & Garg, PODC 2013, arXiv:1302.2543; Mendes–Herlihy–Vaidya–Garg, Distributed Computing 2015). Demanding exact agreement on a discrete value is the wrong model for a value space; MBAA is the right one, and it is not a weaker form of BFT — it is the established formalism for this domain.

Under MBAA, ResonanceBFT's guarantees are:

  • Validity = box (trusted-hyperbox) validity, not convex validity — and that is a deliberate, better choice. The commit's aggregate is the raw coordinate-wise trimmed mean (_trimmed_mean), each coordinate of which lies within the honest per-coordinate range ([f-th smallest, f-th largest]); because we trim by the configured f ≥ the Byzantine count present, a Byzantine minority cannot push that raw aggregate outside the honest value space. Verified by test_byzantine_extreme_cannot_push_aggregate_out_of_honest_box. Precisely: the committed direction is this aggregate normalised (_trimmed_centroid = _normalise(_trimmed_mean)), and the commit gates on cosine similarity (direction only), so the box-valid raw aggregate is what bounds where the commit points — we do not claim the normalised vector's coordinates themselves stay in the box (normalisation can move them; box validity is a property of _trimmed_mean, asserted on that helper). This is box validity; a coordinate-wise aggregate provably cannot give convex validity in d ≥ 2 (Cambus & Melnyk, arXiv:2306.12741). We do not claim convex validity — and choosing box over convex is the better engineering point on both axes their paper proves: convex validity forces a ≥ 2d = 10 worst-case centroid error and needs n ≥ (d+2)f+1 = 7f+1 at d=5, whereas box validity gives 2√5 ≈ 4.47 at the standard n ≥ 3f+1. Convex validity via the SafeArea construction is available but costs half the fault tolerance for a worse centroid — so we consciously do not adopt it.
  • ε-agreement across views. Two resolvers on the same sealed view commit identically (exact). Across different overlapping quorum views the per-view centroids differ by a bounded ε (a function of quorum overlap and honest diameter; Dolev's contraction factor). The additional invariant that holds is no fork on the shared core: no agent visible to both views is committed in one yet rejected in the other. Verified by test_divergent_quorum_views_do_not_fork_shared_core (two overlapping 6-of-7 views both commit; every shared agent classified identically; shared-core divergence < 0.05). The per-view winner is a representative of the agreeing quorum, not a globally-canonical value — which is exactly what MBAA prescribes.
  • The HK deliberation is the iterative-averaging convergence engine — and it is tested. Iterative approximate Byzantine consensus (drop f extremes, average) is stochastic averaging — the same structure as Hegselmann–Krause opinion dynamics (Vaidya, arXiv:1203.1888), with polynomial convergence in fixed dimension (Chazelle & Wang, ITCS 2013). test_deliberation_contracts_opinion_diameter verifies the honest opinion diameter is monotonically non-increasing across deliberation steps and strictly converges (e.g. 1.61 → 1.34 over four steps) — the Dolev contraction realised. Deliberation need only converge; the committed trimmed-mean aggregate carries the box-validity guarantee. This directly targets the failure mode Berdoz, Rugli & Wattenhofer (2026, "Can AI Agents Agree?", arXiv:2603.01213) found dominant in LLM Byzantine consensus — loss of liveness / stalled convergence — which ResonanceBFT engineers against with the n−f commit plus this bounded-round convergence engine.

So the precise claim is: box-validity MBAA over a five-axis value space, with ε-agreement across overlapping quorum views, at n ≥ 3f+1 — a named, proven correctness notion, honestly the correct (not over-claimed) one for consensus over values.

Tradeoff (stated honestly): each agent's sealed relational reflects its participate-time view (uniform for the first participant), so the relational axis does limited discriminating work at commit — the price of resolver-independence. Semantic + affective + epistemic + behavioral carry the quorum decision. When no roster is provided the relational axis is a purely neutral constant (identical for every agent); scoring it with its 0.25 weight would add a flat +0.25 to every pentadic score and silently loosen the threshold, so in that case resolve() drops relational from the commit and renormalises the four informative axes to sum to 1. The threshold then gates only on axes that actually carry information, and the "genuinely five-axis" claim stays honest — relational counts toward the commit only when a roster makes it discriminating (verified by test_neutral_relational_does_not_inflate_pentadic).
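The renormalisation described above is simple arithmetic, sketched here with a hypothetical `commit_weights` helper and an invented agent whose cosine is 0.5 on every informative axis. Only the weights and the 0.60 threshold are taken from this description.

```python
AXIS_WEIGHTS = {"semantic": 0.25, "affective": 0.20, "relational": 0.25,
                "epistemic": 0.15, "behavioral": 0.15}
THRESHOLD = 0.60


def commit_weights(has_roster):
    """Drop the neutral relational axis when no roster makes it discriminating."""
    if has_roster:
        return dict(AXIS_WEIGHTS)
    informative = {ax: w for ax, w in AXIS_WEIGHTS.items() if ax != "relational"}
    total = sum(informative.values())
    return {ax: w / total for ax, w in informative.items()}


# Mediocre alignment on the informative axes; a neutral relational axis
# matches its centroid perfectly for every agent.
cos_by_axis = {"semantic": 0.5, "affective": 0.5, "epistemic": 0.5,
               "behavioral": 0.5, "relational": 1.0}

inflated = sum(w * cos_by_axis[ax] for ax, w in AXIS_WEIGHTS.items())
renormalised = sum(w * cos_by_axis[ax] for ax, w in commit_weights(False).items())
print(round(inflated, 3), round(renormalised, 3))   # 0.625 0.5
```

With the flat +0.25 included the agent would clear the threshold; after renormalising over the four informative axes it does not.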

Byzantine centroid dampening now lives where it belongs — outside the L1 commit. Because the commit centroid is a trimmed mean over non-tampered records, BFT safety rests on the n−f quorum gate plus the trim's breakdown bound (not on resolver-local centroid weighting). The reputation-gap dampening still operates in deliberation (whose voice pulls the group) and in outlier/tamper penalties across rounds — its full mechanism is in the Byzantine centroid dampening section below.


Deliberation, authenticity & robustness — L2 / L3

Everything in this part sits on top of the commit and, by the load-bearing invariant, never changes it. It is what turns a bare quorum into a genuine-vs-superficial judgement — and what keeps the network robust to newcomers, Byzantine entrants, and long-horizon drift.

Deliberation: Hegselmann-Krause with per-axis trust and per-dyad ε

Bounded-confidence (Hegselmann-Krause) averaging monotonically contracts the honest opinion diameter each step, 1.61 to 1.34 — the convergence guarantee

deliberate() runs bounded-confidence (HK 2002) position updates:

# Per-dyad adaptive ε: established pairs get wider neighborhood
eps_ij = base_epsilon + 0.05 * min(co_commits[i,j], 5)  # max +0.25 boost

# Per-axis centroid using axis-specific trust weights
for axis in AXES:
    ax_centroid[axis] = weighted_centroid(
        neighbors[axis_vecs], weights=axis_trust[me][neighbor][axis]
    )

Idempotency: a snapshot of pre-deliberation positions is stored in round.metadata["_pre_deliberation_snapshot"]; calling deliberate() twice restores the snapshot and re-runs, producing the same result.
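A stripped-down bounded-confidence step shows how the per-dyad ε widens the neighbourhood for established pairs. This is a simplification under stated assumptions: opinions are scalars rather than five-axis vectors, averaging is unweighted rather than trust-weighted, and `hk_step` and `diameter` are hypothetical names.

```python
def hk_step(opinions, base_epsilon, co_commits):
    """One Hegselmann-Krause update with a per-pair confidence bound."""
    n = len(opinions)
    updated = []
    for i in range(n):
        neighbours = []
        for j in range(n):
            pair = (min(i, j), max(i, j))
            eps = base_epsilon + 0.05 * min(co_commits.get(pair, 0), 5)
            if abs(opinions[i] - opinions[j]) <= eps:
                neighbours.append(opinions[j])   # always includes i itself
        updated.append(sum(neighbours) / len(neighbours))
    return updated


def diameter(opinions):
    return max(opinions) - min(opinions)


start = [0.0, 0.2, 0.4, 0.8]

# No shared history: agent 3 is outside everyone's neighbourhood and stays put.
strangers = hk_step(start, base_epsilon=0.25, co_commits={})

# Agents 2 and 3 have co-committed five times, so their bound widens to 0.5.
veterans = hk_step(start, base_epsilon=0.25, co_commits={(2, 3): 5})

# The diameter never grows, because every update is an average of current opinions.
trace, state = [diameter(start)], start
for _ in range(5):
    state = hk_step(state, 0.25, {(2, 3): 5})
    trace.append(diameter(state))
print(strangers[3], veterans[3] < start[3], trace)
```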

Trajectory classification (9 types)

An LSTM-like classifier reads the sequence of five-dimensional belief states over four rounds to name the trajectory type, e.g. genuine versus capitulated

After deliberation, the trajectory is classified into one of nine consensus types based on five signals — velocities, concession_symmetry, axis_deltas, evidence_delta, and coalition history:

| Type | Signal pattern |
| --- | --- |
| genuine | bilateral, deep, rising epistemic confidence |
| capitulated | asymmetric movement, falling epistemic confidence |
| coerced | fast convergence, extreme asymmetry, no evidence signal |
| logrolled | opposite-sign axis deltas (cross-axis trade) |
| coalitional | fast + low evidence_delta + median co-commits ≥ 3 |
| fragile | threshold barely met (depth < 0.05) |
| polarized | diverging velocities throughout |
| deadlock | all velocities near zero |
| unknown | insufficient data |

Evidence delta (Agarwal & Khanna 2025, arXiv:2504.00374): tracks whether the epistemic component of each agent's position was pulled toward higher-confidence peers (persuasion) or lower-confidence peers (social pressure). Distinguishes genuine from capitulated even when concession patterns look symmetric.

Coalitional shortcut (Leifeld & Brandenberger 2024): uses median co-commit count across all pairs (not minimum), so a single stranger pair in a group of veterans doesn't veto the coalitional label.
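A small worked case shows why the median, rather than the minimum, is the statistic used for the coalitional shortcut. The agent names and co-commit counts below are invented; only the threshold of 3 comes from the table above.

```python
from statistics import median

COALITION_MIN_CO_COMMITS = 3

# Three veterans (a, b, c) and one agent d who has never co-committed with a.
pair_co_commits = {("a", "b"): 6, ("a", "c"): 5, ("b", "c"): 7,
                   ("a", "d"): 0, ("b", "d"): 4, ("c", "d"): 4}

counts = list(pair_co_commits.values())
by_median = median(counts) >= COALITION_MIN_CO_COMMITS   # 4.5 >= 3
by_minimum = min(counts) >= COALITION_MIN_CO_COMMITS     # 0 >= 3
print(by_median, by_minimum)   # True False
```

Under the minimum, the single stranger pair would veto the coalitional label; under the median it does not.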

Time-decayed dyadic trust with per-axis overlay

trust[i→j] *= 0.92   # per-round exponential decay (half-life ≈ 8 rounds)
trust[i→j] += 0.18   # co-committed last round
trust[i→j] -= 0.45   # outlier or tampered

Per-axis trust (_axis_trust[me][other][axis]) tracks faceted reliability: an agent can be trusted for semantic content but less so for behavioral follow-through. Used to weight per-axis centroids in deliberate() (L2); the L1 commit centroid in resolve() is unweighted, so this trust never enters the certificate.
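The three constants above imply a half-life that is easy to confirm. In the sketch below, the order of operations (decay first, then bonus or penalty) and the clamp to [0, 1] are assumptions made for the example, and `update_trust` is a hypothetical name; the description specifies only the constants.

```python
import math

DECAY = 0.92
CO_COMMIT_BONUS = 0.18
PENALTY = 0.45


def update_trust(trust, co_committed, penalised):
    """One round of dyadic trust update (assumed order: decay, then adjust)."""
    t = trust * DECAY
    if co_committed:
        t += CO_COMMIT_BONUS
    if penalised:
        t -= PENALTY
    return min(1.0, max(0.0, t))   # clamp is an assumption of this sketch


half_life = math.log(0.5) / math.log(DECAY)
print(round(half_life, 2))                         # 8.31
print(round(update_trust(0.5, True, False), 2))    # 0.64
print(round(update_trust(0.5, False, True), 2))    # 0.01
```

One penalty outweighs two co-commit bonuses, so trust is slower to earn than to lose.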

Non-obvious invariant: Byzantine centroid dampening (pre-detection)

This governs the deliberation centroid (_asymmetric_weight), not the L1 commit: the commit centroid is a coordinate-wise trimmed mean over non-tampered records with no trust or reputation weighting (see BFT-agreement above), so commit safety rests on the n−f quorum gate plus the trim. Where reputation does legitimately shape influence (whose voice pulls the group during deliberate()), a new Byzantine agent has far lower influence than an established veteran before any tampering is detected. The dampening mechanism is the reputation gap, evaluated through the real local-only outbound weighting weight[j] = rep(j) × trust(me→j):

weight[byz] = rep_init(1.0)        × trust(me→byz)(default 1.0) ≈ 1.0
weight[vet] = rep(≈1 + 0.12·rounds) × trust(me→vet)(earned ≈0.8) ≈ 5.6   (after ~50 commits)

A veteran who has earned reputation over many committed rounds has multiples of a newcomer's centroid weight, so a brand-new Byzantine entrant cannot capture the centroid before its tampering is detected and its reputation collapses. Verified by test_byzantine_centroid_weight_dampened, which now exercises the real _asymmetric_weight (not a hand-rolled model) across n_honest ∈ [4,10].
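The size of the gap can be illustrated on a single coordinate. This is a sketch under stated assumptions: `outbound_weight` and `weighted_centroid` are hypothetical stand-ins for what `_asymmetric_weight` feeds into, and the reputation growth uses the ≈ 1 + 0.12·rounds estimate from above.

```python
REP_INIT = 1.0
REP_GAIN_PER_COMMIT = 0.12  # from the rep(≈1 + 0.12·rounds) estimate above

def outbound_weight(rep, trust_me_to_j):
    """weight[j] = rep(j) × trust(me→j), using only the observer's own trust."""
    return rep * trust_me_to_j

def weighted_centroid(positions, weights):
    """Weighted mean of 1-D positions (a stand-in for one axis coordinate)."""
    total = sum(weights)
    return sum(p * w for p, w in zip(positions, weights)) / total

vet_w = outbound_weight(REP_INIT + REP_GAIN_PER_COMMIT * 50, 0.8)  # about 5.6
byz_w = outbound_weight(REP_INIT, 1.0)                              # 1.0

# Four veterans at position 0.0, one fresh Byzantine entrant pushing toward 1.0.
positions = [0.0, 0.0, 0.0, 0.0, 1.0]
dampened = weighted_centroid(positions, [vet_w] * 4 + [byz_w])
flat = weighted_centroid(positions, [1.0] * 5)  # what equal weights would give
```

With equal weights the entrant drags the deliberation centroid to 0.2; with the reputation gap it moves it only to about 0.043.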

This is exactly why newcomers are not seeded to the network-median reputation: doing so would erase the gap and hand a stranger veteran-level pull. The obvious-looking "be nice to newcomers, start them at the median" alternative is a pre-detection Byzantine-amplification bug; we reject it deliberately (see cold-start below).

Cold-start: integrating a new agent without an isolation spiral (or a Byzantine hole)

A genuinely new agent arrives with no reputation, no dyadic trust, and no co-commit history. Left unaddressed, one early divergence triggers a destructive feedback loop: outlier → trust penalty → lower neighborhood inclusion → more divergence → permanent isolation. But the naive fix (inflate the newcomer's standing) re-opens the Byzantine hole above. ResonanceBFT threads this with two asymmetric, safety-preserving mechanisms:

  1. Trust grace period (_is_newcomer). For the first _NEWCOMER_GRACE_ROUNDS (3) outcomes an observer has seen a peer in, no local dyadic-trust penalty is applied for being an outlier — breaking the cascade at its source. The signal is the observer's own peer_encounters counter, the only newcomer signal a distributed node actually maintains (a peer's own participation count lives in the peer's store, not yours). Reputation is never waived (it is the global, observable misbehavior signal), and tampering is never waived (deliberate attack ≠ innocent divergence).
  2. No reputation inflation. Newcomers keep rep = _REPUTATION_INIT, preserving Byzantine centroid dampening. Voice is earned through committed rounds, not granted on arrival.

Crucially, BFT safety is untouched: the grace period changes how fast private trust moves, never the commit certificate size (quorum_needed = n − f). A newcomer — honest or Byzantine — is still one vote among n and cannot force a commit. Verified by test_newcomer_trust_protected_for_grace_rounds, test_persistent_outlier_loses_trust_after_grace (the penalty does fire once grace expires), test_tampering_penalised_even_during_grace, and test_newcomer_reputation_stays_low_for_byzantine_dampening.
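The asymmetry between the two mechanisms can be summarised in a few lines. This is a minimal sketch: the function name and the exact boundary (whether the encounter counter is incremented before or after the check) are assumptions; the grace length and penalty size are the values quoted in this document.

```python
NEWCOMER_GRACE_ROUNDS = 3   # value quoted above
OUTLIER_PENALTY = 0.45      # trust penalty from the trust-update rule

def trust_penalty(peer_encounters, is_outlier, is_tampered):
    """Local dyadic-trust penalty an observer applies to one peer this round."""
    if is_tampered:
        return OUTLIER_PENALTY          # tampering is never waived
    if is_outlier and peer_encounters > NEWCOMER_GRACE_ROUNDS:
        return OUTLIER_PENALTY          # grace expired: persistent outliers pay
    return 0.0                          # in grace, or not an outlier
```

Note that nothing here touches reputation or `quorum_needed`: the grace period only changes how fast the observer's private trust moves.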

Sybil guard & view-change

Sybil guard — idempotent participate(). An agent that calls participate() twice cannot double-vote or overwrite its sealed commitment:

await agent.participate(rnd)  # registers evaluation
v2 = await agent.participate(rnd)  # returns cached commitment, no mutation
assert v2.metadata["sybil_guard"] is True

The guard is enforced at the round-metadata level, so it works even if multiple coroutines race to call participate on the same round object.
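One way such a guard can be race-safe on a single event loop is to keep the check and the write free of any `await` between them. The sketch below is an assumption about the shape of the mechanism, not the plugin's code: the free function `participate`, the `commitments` key, and the `repr`-based hash are all hypothetical stand-ins.

```python
import asyncio
import hashlib

async def participate(round_meta, agent_id, belief):
    """Register a sealed commitment once; later calls return the cached one."""
    sealed = round_meta.setdefault("commitments", {})
    if agent_id in sealed:                      # repeat call: no mutation
        return {"commitment": sealed[agent_id], "sybil_guard": True}
    # No await between the check and the write, so coroutines sharing one
    # event loop cannot interleave here.
    sealed[agent_id] = hashlib.sha256(repr(belief).encode()).hexdigest()
    return {"commitment": sealed[agent_id], "sybil_guard": False}

async def demo():
    meta = {}
    first, second = await asyncio.gather(
        participate(meta, "a0", [0.1, 0.2]),
        participate(meta, "a0", [0.9, 0.9]),    # attempted overwrite
    )
    return meta, first, second

meta, first, second = asyncio.run(demo())
```

The second call gets the first call's commitment back and leaves the round metadata with a single entry for `a0`.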

View-change — round-robin leader rotation. On abort, the next proposer is chosen deterministically so any agent can verify the expected leader without coordination:

proposer = all_agents[view_number % len(all_agents)]

Carries view_change: True + new proposer in metadata.
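A small sketch of why this needs no coordination: every agent can evaluate the same pure function. The `sorted` call is an assumption added here so that agents holding the membership list in different orders still agree; the one-liner above indexes `all_agents` directly.

```python
def next_proposer(all_agents, view_number):
    """Deterministic round-robin leader for a view; every agent computes the same answer."""
    ordered = sorted(all_agents)  # assumption: a shared canonical ordering
    return ordered[view_number % len(ordered)]

agents = ["a2", "a0", "a1"]
leaders = [next_proposer(agents, v) for v in range(5)]  # views 0..4
```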

Honest scope of the view change. This is a simplified round-robin re-proposal on abort, not a full PBFT/HotStuff view-change protocol. Two honest properties hold: (a) a view change only follows an abort — a round that reaches an n−f quorum commits, so it never triggers a new view, hence at most one view commits per decision; and (b) each re-proposal is a fresh Round with a new round_id. The consequence, stated rather than hidden: because views carry distinct round_ids, validate_bft_no_conflicting_commits (which groups by round_id) enforces the no-fork property within a view, and cross-view safety rests on property (a) rather than on a committed-value carry-forward across views. A full view-change protocol that links views of one decision and forwards the highest committed certificate is deliberately out of scope for this reference plugin — we do not claim it.

Adaptation, observability & replay (long-horizon detail → appendix)

Three capabilities round out the L2/L3 stack. All three are deliberately kept out of the L1 commit path and, in the short graded scenarios, are largely dormant — so their full detail lives in the appendix, with the essentials summarised here:

  • Three-timescale adaptive weights (L3). Twelve parameters self-calibrate across three timescales (per-round trust/ε · per-epoch threshold/base-ε · slow axis-weights via Exponentiated Gradient), each preserving a machine-checked BFT invariant (simplex · gain-stability · ε-bounds · threshold clamp to the safety lower bound). They shape deliberation only — the L1 commit reads fixed constants — and are a no-op at the seed weights, so short runs stay byte-identical and the effect emerges only over long, warmed-up simulations. → full parameter tables, stability algebra, and clamp: Appendix D.
  • Observability — pentadic_summary() / sycophancy_score(). First-class analytical instruments (not just debug prints): one renders the per-axis alignment table from a resolved outcome, making the five-axis claim falsifiable; the other scores each agent's persuasion-vs-pressure drift after deliberation. → rendered report + falsifiability tests: Appendix B.
  • Replay — TrustStore.snapshot() / restore(). Deep-copies every memory layer so a researcher can rewind and re-run deliberation under different parameters on identical state. → Appendix C.

Tradeoffs, evidence & verification

The design is opinionated; this part states what was traded away, then shows the protocol actually committing — first through the town runner with scripted stances, then with real LLM opinions at scale — and how to reproduce all of it.

Tradeoffs

| Design choice | Alternative | Why this |
|---|---|---|
| Seal all five belief axes (not combined) | Seal combined | deliberate() legitimately updates combined; sealing it flags every participant as tampered. The five belief axes carry the quorum weight, so all five are sealed; sycophancy lives in belief revision, not position updates. |
| Weighted pentadic similarity (_AXIS_WEIGHTS) | Full-vector cosine | 64-dim semantic swamps 2-dim axes (91% effective weight). Explicit weights make the five-axis claim honest. |
| Local outbound trust trust[me→j] | Inbound trust mean(trust[k→j]) | Inbound trust requires reading all peers' private state: non-local, only works in simulation. |
| Median co-commits for coalitional | Minimum co-commits | Min is vetoed by a single stranger pair. Median preserves genuine alliance signal across the majority. |
| Stability = 0.5 for new agents | Stability = 1.0 | position_stability=1.0 gives unearned epistemic authority; 0.5 is the neutral prior. |
| IDF-style vocab (appearance order, len > 2) | Most-frequent-first | Frequency-sort favors function words that slipped the stop filter; appearance order in context is a better content-word proxy. |
| Snapshot + restore in deliberate() | Stateful one-shot | Makes deliberate() idempotent; safe to call multiple times (e.g., with different step sizes) without corrupting the round state. |
| quorum_needed = n−f | quorum_needed = 2f+1 | 2f+1 only guarantees honest intersection when n = 3f+1 exactly. n−f is correct for all n ≥ 3f+1. |
| BoW TF semantic embedding | LLM embedding | Deterministic, no network calls, fixed vocabulary from task description + participant private texts. |
| threshold = 0.60 | threshold = 0.90 | In 5D joint space, inter-agent similarity is lower than in 1D; 0.60 empirically passes Hypothesis invariants for n ∈ [4, 12]. |
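The dimensionality-domination row can be made concrete with a toy pair of agents. This is a sketch under assumptions: the weights below are illustrative values that sum to 1 (the plugin's `_AXIS_WEIGHTS` may differ), and `pentadic_similarity` and `flat_cosine` are hypothetical helper names.

```python
import math

def cosine(u, v):
    """Plain cosine similarity; returns 0.0 if either vector is all zeros."""
    nu, nv = math.sqrt(sum(x * x for x in u)), math.sqrt(sum(x * x for x in v))
    return sum(a * b for a, b in zip(u, v)) / (nu * nv) if nu and nv else 0.0

# Illustrative weights (assumption); they sum to 1.
AXIS_WEIGHTS = {"semantic": 0.24, "affective": 0.22, "relational": 0.27,
                "epistemic": 0.14, "behavioral": 0.13}

def pentadic_similarity(a, b, weights=AXIS_WEIGHTS):
    """Weighted sum of per-axis cosines, so no axis wins by dimensionality."""
    return sum(w * cosine(a[axis], b[axis]) for axis, w in weights.items())

def flat_cosine(a, b, axes=AXIS_WEIGHTS):
    """Cosine over the concatenated vector, for comparison."""
    return cosine([x for k in axes for x in a[k]], [x for k in axes for x in b[k]])

# Identical 64-dim semantics, opposite 2-dim positions on the other four axes.
sem = [1.0] * 64
a = {"semantic": sem, "affective": [1, 0], "relational": [1, 0],
     "epistemic": [1, 0], "behavioral": [1, 0]}
b = {"semantic": sem, "affective": [-1, 0], "relational": [-1, 0],
     "epistemic": [-1, 0], "behavioral": [-1, 0]}
```

For this pair `flat_cosine(a, b)` is 60/68 ≈ 0.88 even though the agents disagree on four of five axes, while `pentadic_similarity(a, b)` is −0.52.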

End-to-end: consensus actually driven through the town runner

Earlier revisions disclosed honestly that the framework's generic consensus scenario is a toy leader/follower vote that ignores the coordination plugin — so no town run actually exercised ResonanceBFT; the protocol was covered only by the unit/property suite. That gap is now closed.

nest_plugins_reference/scenarios/resonance_bft_consensus.py registers a scenario (via the public register_scenario API, with no core changes) whose agents drive the real protocol over the simulator's in-memory transport. The leader propose()s a Round and seals its evaluation. Each follower participate()s and returns a self-contained Vote carrying its full sealed record in Vote.metadata, so a generic transport can drive resolve() from the votes alone rather than only via the shared Round.metadata that the single-process runner mutates (test_votes_are_self_contained_for_generic_transport). The leader resolve()s the n−f quorum and commit()s, then broadcasts the committed Outcome so that every agent applies commit() and adapts its own trust: genuine multi-agent L3 adaptation, not leader-only (verified in the e2e test by follower O| receipts). Each agent gets its own ResonanceBFT instance from the runner-resolved class, exactly as the framework instantiates a coordination plugin. The default BoW semantic axis is transport-safe: each record carries the vocabulary it was embedded over and resolve() remaps every record onto the canonical union vocab, so followers that extended the vocab with different private words are still compared on aligned coordinates (test_transport_divergent_vocab_not_falsely_aligned).
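The vocabulary reconciliation can be pictured as follows. This is a sketch only: the record shape, the helper names, and the sorted ordering of the union basis are assumptions; the source states only that each record carries its vocabulary and that resolve() remaps onto a canonical union.

```python
def union_vocabulary(records):
    """Canonical basis: sorted union of all record vocabularies (ordering is an assumption)."""
    return sorted({word for rec in records for word in rec["vocab"]})

def remap(vector, vocab, union_vocab):
    """Place each coordinate under its word in the union basis; unseen words get 0."""
    by_word = dict(zip(vocab, vector))
    return [by_word.get(word, 0.0) for word in union_vocab]

# Two followers extended a shared base vocab with different private words.
rec_a = {"vocab": ["route", "safe", "cache"], "semantic": [1.0, 1.0, 1.0]}
rec_b = {"vocab": ["route", "safe", "shard"], "semantic": [1.0, 1.0, 1.0]}

union = union_vocabulary([rec_a, rec_b])
vec_a = remap(rec_a["semantic"], rec_a["vocab"], union)
vec_b = remap(rec_b["semantic"], rec_b["vocab"], union)
```

Compared position by position, the two raw vectors look identical; after remapping, `cache` and `shard` occupy different coordinates, so the private words no longer count as agreement.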

Running scenarios/resonance_bft_consensus.yaml (12 agents × 6 rounds, dense embedding on the semantic axis) through the real ScenarioRunner produces this trace — the protocol's own per-round output, not lifecycle events:

| round | topic | stances | status | quorum (n−f) | false_agreement |
|---|---|---|---|---|---|
| 1 | budget | unanimous | committed | 9/9 | 0.00 |
| 2 | timeout | split | committed | 9/9 | 0.56 |
| 3 | rollout | unanimous | committed | 9/9 | 0.00 |
| 4 | schema | split | committed | 9/9 | 0.56 |
| 5 | indexing | unanimous | committed | 9/9 | 0.00 |
| 6 | caching | split | committed | 9/9 | 0.56 |

The driver resolves at the n−f quorum (9 of 12), not at unanimity: it commits as soon as a quorum of sealed evaluations arrives, so slow or silent agents never block liveness. The split rounds are the interesting ones. Agents that "approve" and "reject" the same proposal are cosine-close on the semantic axis, so the quorum still forms (a false consensus by topic similarity), but the antonym-anchored stance audit (_polarity.py, grounded in the Linear Representation Hypothesis / SensePOLAR) flags false_agreement ≈ 0.56, while genuinely unanimous rounds read 0.00. This is the embedding scheme and the stance audit running end-to-end in a live multi-round town simulation, not in a unit test. The audit is diagnostic only: it never gates the n−f commit certificate (the load-bearing "L2/L3 never alter L1" invariant), and because it reads only sealed vectors plus a fixed direction, it stays resolver-independent.

Fault tolerance, demonstrated through the runner. scenarios/resonance_bft_consensus_faulty.yaml (7 agents, silent: 2, i.e. two crashed/partitioned followers that never respond) still commits at quorum = 5/5 = n−f from the 5 responders, demonstrating the n−f tolerance in a live run, not just in the unit suite (test_resonance_bft_commits_at_quorum_despite_silent_agents). Conversely, under a genuine partition, scenarios/resonance_bft_consensus_partition.yaml (4/3 split, only 4 reachable, quorum_needed = 5) produces no commit while partitioned: liveness requires a sufficient quorum. All three scenarios are CI-verified by tests/test_resonance_bft_e2e.py, which runs them through the real ScenarioRunner and asserts on the consensus events in the trace.

The same run with a real no-torch encoder swapped in (embed: model2vec, or fastembed's bge-small via ONNX) confirms the benchmark's prediction in the live town: the contextual encoder (fastembed) flags the split rounds identically (false_agreement = 0.55), while the static encoder (model2vec) commits the same rounds but reads 0.0 — static embeddings cannot separate stance (matching the opp_sep_rate 1.00 vs 0.67 in BENCHMARKS.md). The mechanism, the benchmark, and the live multi-round simulation all agree.

Perception-to-signal pipeline with mini bar charts: an encoder separates stance (fastembed 1.00 vs static 0.67), fixed axis weights stop dimensionality domination, and the false-consensus audit flags same-topic opposite-stance rounds (0.55 vs 0.0)

The recommended real encoder is therefore fastembed bge-small — a contextual, attention-based transformer run via ONNX Runtime (no PyTorch). This distinction is load-bearing, not cosmetic: model2vec is a static embedding (distilled token vectors looked up and averaged — no attention at inference), which is precisely why it cannot separate stance; the attention-derived representation is what makes the polarity direction linear (Park et al. 2024) and the audit fire. For stance specifically, we measured the cost of staying single-vector against a no-torch ONNX NLI cross-encoder (Xenova/nli-deberta-v3-small, which reads both utterances jointly with cross-attention): the cheap linear probe ties it on explicit approve/reject (6/6 each), but the NLI model wins 6/6 vs 0/6 on hard stance — negation, implicit, and off-axis opinions the probe cannot reach. So the probe is the right resolver-independent in-protocol audit, and the NLI cross-encoder is the optional high-accuracy analyzer — both no-torch (examples/resonance_bft_embeddings/nli_vs_probe.py).

Honest scope: this is a single-process, in-memory-transport drive (not multi-host networking), but it is the plugin genuinely reaching — and committing — consensus inside a town run, over the message bus, across many rounds.

Real-LLM town evidence: actual agents, real models, across four subscription backends

Five town scenarios by four models: every quorum-slack case commits identically, byzantine commits with tampered equals 2, partition never commits — model-agnostic

The scenarios above use scripted stances so they are deterministic and CI-runnable. To confirm the town commits with real model-generated opinions, examples/llm_consensus/evidence_town.py drives the same committed scenario YAMLs through the same ScenarioRunner, but each honest agent's opinion is produced by a real LLM via a key-free subscription CLI (Claude Code claude, Codex codex, Antigravity agy), injected through a generic, default-off scenario hook (task.config["opinions"]; with no opinions supplied the scenario is byte-for-byte the shipped scripted behaviour, so this adds no runtime dependency and cannot affect plugin behaviour). It is live-marked and excluded from CI (-m "not live"), so it never gates the submission.

A full matrix of 5 scenarios × 4 tiers × 3 reps = 60 real town runs, using the lowest/fastest models only (the deterministic core needs no frontier model), is recorded in examples/llm_consensus/EVIDENCE.md. The four tiers, with exact model ids (full config + metrics in examples/llm_consensus/MODELS.md): mock (deterministic), claude:haiku = Anthropic claude-haiku-4-5, codex = OpenAI gpt-5.5 (model_reasoning_effort=low), and agy = Antigravity fronting Google Gemini 3.5 Flash (Low thinking tier); agy is a CLI, not a model. Result: every scenario with quorum slack reached its designed outcome on every model and every rep (byzantine commits with tampered=2, partition never commits, bag-of-words commits), so the decision is model-agnostic: the LLM changes the opinions and the audit's consensus_type, never the commit rule. Real-LLM testing also surfaced an honest liveness nuance the scripted mock masks: at the exact quorum floor (silent-crash, present == n−f, zero slack) a single genuinely divergent real opinion can prevent assembling a coherent quorum, so the round safely does not commit. That is correct safety (never commit an incoherent quorum), and a sensitivity that scenarios with even one seat of slack absorb every time. This is the "transformer-at-the-edge, deterministic core" thesis demonstrated on real models, not just asserted.

A second, larger harness (examples/llm_consensus/evidence_scale.py, report in EVIDENCE_LARGE.md, with a one-glance dashboard in EVIDENCE_LARGE.svg) enlarges four dimensions independently — 88 real-town runs (80 + 8 cross-model at n=13,25) — and sharpens each claim:

Bar chart of commit quorum at n equals 4, 7, 13, 25, 49 — all commit with quorum equal to n minus f, on three independent models

  • Scale. The real town commits BFT at n = 4, 7, 13, 25, 49 (f = ⌊(n−1)/3⌋ up to 16), with the commit quorum tracking n−f exactly at every size — e.g. Gemini 3.5 Flash commits 33/33 at n = 49, and claude-haiku-4-5 and gpt-5.5 both commit at n = 13 (9/9) and n = 25 (17/17) too, so scaling is confirmed across three independent models, not one. The deterministic core scales; only the opinions come from the model.
  • Topic-independence. Across 8 unrelated questions (policy, tech, ethics), every round commits — commit is a function of the sealed vectors, not the subject; only the audit's consensus_type shifts with how opinions cluster.
  • Quorum-floor liveness, quantified. At the exact present == n−f floor (zero slack), across 10 reps × 3 models: agy 10/10, claude 9/10, codex 9/10 commit. The ~1-in-10 non-commit is the protocol safely refusing an incoherent quorum — the same nuance the small run surfaced, now with a rate.
  • Multi-round L3 evolution. Over 8 rounds every round commits, and the per-round consensus_type warms up fragile → genuine as the Layer-3 trust/weights adapt on real opinions — while the L1 n−f certificate stays untouched.

The harness persists every run and supports --resume, so this matrix completed across several interruptions with zero repeated work; it is live-marked and never runs in CI.

Verification

Test pyramid: 262 unit and property tests, 51 Byzantine validators, 8 end-to-end town runs, and 1 live real-LLM test at the apex — 321 plugin tests, CI-clean

The full edge-case matrix and the machine-checked invariant list are in Appendix E and Appendix F; the package layout is in Appendix A.

Run all tests:

# From the repo root — runs the full CI gate suite:
uv run ruff check . && uv run ruff format --check . && uv run pyright && uv run pytest -q
# ruff: All checks passed · pyright: 0 errors · pytest: 1057 passed, 1 skipped, 2 deselected
# (this plugin contributes 321 tests: 262 test_resonance_bft + 51 test_bft_validators +
#  8 test_resonance_bft_e2e; plus a live-marked real-LLM town test excluded from CI.
#  The optional-ML-dep embedding example scripts require requirements-embeddings.txt and are
#  excluded from the pyright gate — the plugin and all tests type-check strict-clean.)

Verify plugin is discoverable:

uv run python -c "
from nest_core.plugins import PluginRegistry
r = PluginRegistry()
cls = r.resolve('coordination', 'resonance_bft')
print('Found:', cls)
# Found: <class '...coordination.resonance_bft.ResonanceBFT'>
"

Run ResonanceBFT — full pipeline including deliberation:

import asyncio
from nest_core.types import AgentId, Task
from nest_plugins_reference.coordination.resonance_bft import ResonanceBFT

async def demo():
    task = Task(id="t1", description="select routing model safely", requirements=[])
    agents = [ResonanceBFT(AgentId(f"a{i}"), seed=i) for i in range(7)]
    rnd = await agents[0].propose(task)
    for agent in agents:
        await agent.participate(rnd)
    # Deliberation: 3 rounds of HK bounded-confidence (idempotent)
    traj = await agents[0].deliberate(rnd, steps=3, epsilon=0.15)
    print(f"consensus_type={traj.consensus_type}")  # e.g. genuine/fragile/coalitional
    outcome = await agents[0].resolve(rnd)
    print(f"status={outcome.metadata['status']}")
    print(f"quorum={outcome.metadata['quorum_size']}/{outcome.metadata['quorum_needed']}")
    print(f"tampered={outcome.metadata['tampered_agents']}")  # [] for honest rounds
    print(f"conflict_type={outcome.metadata['conflict_type']}")
    for agent in agents:
        await agent.commit(outcome)

    # Human-readable per-axis breakdown
    from nest_plugins_reference.coordination.resonance_bft import pentadic_summary, sycophancy_score
    print(pentadic_summary(outcome.metadata))

    # Sycophancy diagnosis: who moved toward social pressure vs genuine persuasion?
    scores = sycophancy_score(traj.evidence_delta)
    pressured = [aid for aid, s in scores.items() if s < -0.01]
    print("Pressure-driven agents:", pressured)  # [] in genuine consensus

asyncio.run(demo())
# consensus_type=genuine
# status=committed
# quorum=7/5
# tampered=[]
# conflict_type=none
# ╔══════════════════════...
# Pressure-driven agents: []

Test that deliberation does NOT break the tamper check (the key commit correctness fix):

# Prior bug: deliberate() overwrote combined → everyone flagged as tampered in resolve()
# Fixed: commitment seals the five immutable belief axes (incl. relational_sealed), never combined
await agents[0].deliberate(rnd, steps=3)
outcome = await agents[0].resolve(rnd)
assert outcome.metadata["tampered_agents"] == []  # ✓

Test Byzantine agent is detected when it changes its semantic belief:

# Byzantine: change semantic belief after seeing others (equivocation)
rnd.metadata["evaluations"]["spy"]["semantic"] = [0.0] * 64
outcome = await agents[0].resolve(rnd)
assert "spy" in outcome.metadata["tampered_agents"]  # ✓
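Both checks above follow from what the commitment covers. The sketch below is an assumption about the general shape, not the plugin's sealing code: it hashes only the five belief axes with SHA-256 over a JSON encoding, and it leaves out the vocabulary basis and the signature that the real record also carries. The field names `combined` and `commitment` are illustrative.

```python
import hashlib
import json

BELIEF_AXES = ("semantic", "affective", "relational", "epistemic", "behavioral")

def seal(record):
    """SHA-256 over the five belief axes only; 'combined' is deliberately excluded."""
    payload = json.dumps({axis: record[axis] for axis in BELIEF_AXES}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def is_tampered(record):
    """A record is tampered if its axes no longer match the stored commitment."""
    return seal(record) != record["commitment"]

record = {"semantic": [0.2, 0.8], "affective": [0.1], "relational": [0.5],
          "epistemic": [0.9], "behavioral": [0.3], "combined": [0.4]}
record["commitment"] = seal(record)

record["combined"] = [0.45]              # what deliberation is allowed to change
ok_after_deliberation = not is_tampered(record)

record["semantic"] = [0.0, 0.0]          # post-hoc belief change (equivocation)
flagged = is_tampered(record)
```

Updating `combined` leaves the seal intact, while rewriting any sealed axis breaks it.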

Run adversarial validators:

from nest_plugins_reference.validators import (
    validate_bft_no_conflicting_commits,
    validate_bft_no_equivocation,
    validate_bft_no_forged_quorum,
    validate_bft_liveness_view_progress,
)

# All pass on ResonanceBFT outcomes
result = validate_bft_no_forged_quorum([outcome])
assert result.passed, result.detail

# All fail on contract_net outcomes (missing BFT metadata)
result = validate_bft_no_forged_quorum(contract_net_outcomes)
assert not result.passed  # "outcomes missing BFT quorum metadata (protocol mismatch?)"

Use existing consensus scenario:

# scenarios/consensus.yaml — change one line:
layers:
  coordination: resonance_bft   # was: contract_net

Hardening in this revision. Adversarial LLM-judge review (gpt-5.5-pro at high reasoning) surfaced three latent defects, each now fixed with a regression test:

  • Partition split-brain: resolve() derived n from received evaluations, letting a partitioned minority lower its own quorum bar. n is now the fixed cluster membership (max(present, expected_n), sourced from all_agents, task.metadata["expected_participants"], or the expected_n constructor arg).
  • False tamper on vocab growth: an earlier approach re-embedded prior agents' semantic vectors when a late participant extended the vocab, invalidating their sealed commitments and falsely flagging them. Fixed at the root: the vocab is append-only and prior agents are never re-embedded. Their shorter vector is a prefix of the longer one and _cosine zero-pads, so their commitment stays valid (only the current agent re-embeds its own text and re-signs).
  • Dead safety clamp: clamp_threshold() was defined but never called; it now runs after every Layer-2 threshold update, so the adaptive threshold can never erode below the BFT safety lower bound.
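The split-brain fix comes down to where n is read from. This is a counting-only sketch: the function names are hypothetical, f is taken as ⌊(n−1)/3⌋ as in the scale results above (a configured fault bound would work the same way), and the similarity threshold that a real commit must also pass is ignored.

```python
def quorum_needed(present, expected_n):
    """n is the fixed cluster membership, never just the evaluations that arrived."""
    n = max(present, expected_n)
    f = (n - 1) // 3            # largest f with n >= 3f + 1
    return n - f

def can_commit(present, expected_n):
    """Counting check only: are enough evaluations present to form a quorum?"""
    return present >= quorum_needed(present, expected_n)

# 7-agent cluster split 4/3: neither side reaches the n−f = 5 bar.
majority_side = can_commit(present=4, expected_n=7)
minority_side = can_commit(present=3, expected_n=7)

# Pre-fix behaviour, with n taken from received evaluations only:
buggy_bar = quorum_needed(present=4, expected_n=0)   # n = 4, f = 1
```

With n derived from what arrived, the 4-agent side would have needed only 3 evaluations and could have committed on its own; with the membership floor it cannot.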

Deliberation semantics were also clarified: deliberate() drives trajectory classification and adaptive learning, while the commit certificate is intentionally fixed by the sealed belief axes (anti-sycophancy).

Safety proof sketch (n=7, f=2)

  • quorum_needed = n−f = 5
  • Any two quorums ≥ 5 must share ≥ 5+5−7 = 3 agents
  • At most f=2 of those 3 can be Byzantine → ≥ 1 honest agent in both quorums → no conflicting commits

validate_bft_no_conflicting_commits enforces this mechanically.
validate_bft_no_equivocation enforces that no agent presents different sealed beliefs to different resolvers.
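The same counting argument can be checked for other cluster sizes. This is a small sketch, independent of the plugin, with f taken as ⌊(n−1)/3⌋; it also shows why a 2f+1 quorum is only safe when n = 3f+1 exactly.

```python
def min_overlap(n, quorum):
    """Smallest possible intersection of two quorums of the given size among n agents."""
    return max(0, 2 * quorum - n)

def honest_in_overlap(n, f, quorum):
    """True if any two quorums must share at least one honest agent."""
    return min_overlap(n, quorum) >= f + 1

# For each n: (safe with quorum n − f, safe with quorum 2f + 1)
results = {}
for n in range(4, 50):
    f = (n - 1) // 3
    results[n] = (honest_in_overlap(n, f, n - f), honest_in_overlap(n, f, 2 * f + 1))
```

For n = 7 both rules give a quorum of 5 and both are safe; for n = 5 (f = 1) two quorums of 2f+1 = 3 can overlap in a single agent, which may be the Byzantine one, whereas n − f = 4 forces an overlap of 3.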



📎 Appendices A–H are posted as the first comment below (GitHub's PR-body size limit). The in-body Contents links for the appendix sections point there.

…orks

Problem projnanda#10 (partition-tolerant BFT consensus). Agents agree on a five-axis
belief vector (semantic / affective / relational / epistemic / behavioral)
rather than a single value, with a strict layer separation:

- L1 safety core: deterministic n-f quorum commit over sealed evaluations.
  Every record seals all five belief axes plus the BoW vocabulary basis
  (SHA-256 commitment) and carries a round- and identity-bound ed25519
  signature; tampered records are excluded before they can touch the
  centroid. The commit certificate is a pure function of shared sealed
  state: fixed threshold, fixed axis weights, a non-lowerable membership
  floor, and a coordinate-wise trimmed mean trimmed by the configured
  fault bound f (box-validity MBAA at n >= 3f+1).
- L2 audit: consensus-quality diagnostics (genuine vs sycophantic
  agreement, evidence-delta, trajectory classification, stance-aware
  false-agreement detection) that never alter the L1 commit.
- L3 adaptation: three-timescale learning (per-round dyadic trust,
  per-epoch threshold/epsilon, slow axis weights via exponentiated
  gradient), structurally barred from the commit gate.

Ships with adversarial validators (equivocation certificates, forged-quorum
and fork detection), a ScenarioRunner scenario driving the real protocol over
the transport with silent / Byzantine / partition / divergent-vocab fault
modes, runnable examples with real-LLM town evidence, an annotated and
verified bibliography, and 18 self-contained architecture diagrams.

321 plugin tests (unit + Hypothesis properties + adversarial + e2e; a
live-marked real-LLM test excluded from CI). ruff + pyright clean.
@srcJin

srcJin commented Jul 2, 2026

Author

Appendix

Reference material and the finer detail, referenced from the body above. None of it is required to grasp the thesis; all of it is here so the claims are checkable.

Appendix A. Package structure

packages/nest-plugins-reference/
  nest_plugins_reference/coordination/resonance_bft/
    __init__.py       (92 lines)   — re-exports all public symbols
    _vectors.py       (340+ lines) — pure stateless functions (tokenise, embed, affective,
                                     sycophancy_score, pentadic_summary, …)
    _types.py         (93 lines)   — Offer, ConsensusType, ConsensusTrajectory
    _trajectory.py    (242 lines)  — classify + conflict analysis (pure functions)
    _trust.py         (520+ lines) — TrustStore class (all per-agent memory,
                                     snapshot/restore, median_co_commits,
                                     three-timescale adaptive weights)
    _protocol.py      (825 lines)  — ResonanceBFT main class (Sybil guard, HK deliberation)

  nest_plugins_reference/validators/
    bft_validators.py              — 4 adversarial validators (no_conflicting_commits,
                                     no_equivocation, no_forged_quorum,
                                     liveness_view_progress) + equivocation conflict
                                     certificate (build/verify/collect — transferable proof)

  tests/
    test_resonance_bft.py          — 262 tests across 30+ test classes
                                     (unit + Hypothesis property tests + adversarial + Sybil guard +
                                     snapshot/restore + sycophancy_score + pentadic_summary +
                                     adaptive trust params + adaptive ε + adaptive threshold +
                                     adaptive axis weights + cold-start/newcomer + design-fix regression)
                                     Hypothesis covers: simplex invariant, stability invariant,
                                     BFT lower bound, epsilon bounds — for arbitrary inputs
    test_bft_validators.py         — 51 validator tests

  nest_plugins_reference/scenarios/
    resonance_bft_consensus.py     — leader/follower brains that drive the REAL protocol
                                     over the sim transport (propose→participate→resolve→
                                     commit), multi-round, optional dense embedding

  examples/resonance_bft_embeddings/
    benchmark.py, polarity_probe.py, requirements-embeddings.txt  — reproducible,
                                     no-torch semantic-axis + stance benchmarks (see BENCHMARKS.md)

scenarios/
  resonance_bft_wiring_partition.yaml  — 7 agents, 4/3 partition (lifecycle wiring)
  resonance_bft_wiring_byzantine.yaml  — 7 agents, 0.28 Byzantine fraction
  resonance_bft_consensus.yaml         — 12 agents × 6 rounds, dense embedding (commits at n−f e2e)
  resonance_bft_consensus_faulty.yaml  — 7 agents, 2 silent → commits at n−f=5 (crash-fault tolerance e2e)
  resonance_bft_consensus_byzantine.yaml — 10 agents, 2 lying → tampered detected+excluded, honest quorum commits (Byzantine-fault e2e)
  resonance_bft_consensus_bow.yaml     — 7 agents, DEFAULT bag-of-words encoder over transport (vocab reconciled) (default-path e2e)
  resonance_bft_consensus_partition.yaml — 7 agents, 4/3 split (no commit: liveness e2e)

The five resonance_bft_consensus* scenarios give the runner-driven e2e suite real-world coverage of every fault mode the protocol claims to handle: happy-path multi-round (with the dense-embedding stance audit firing), crash faults (silent agents → still commits at n−f), Byzantine faults (agents submitting tampered records → detected, excluded, honest quorum commits), the default zero-dependency encoder over the transport (per-record vocabularies reconciled), and network partition (minority cannot commit — liveness). Each is CI-verified by tests/test_resonance_bft_e2e.py.

Appendix B. Observability report

Two analysis helpers ship alongside the protocol — not just for debugging but as first-class analytical instruments that a cognitive-social-computing researcher would actually reach for after every run:

pentadic_summary(outcome.metadata) renders a structured per-axis alignment table directly from a resolved outcome, without requiring the caller to know the internal metadata schema:

outcome = await plugin.resolve(rnd)
print(pentadic_summary(outcome.metadata))
# ╔══════════════════════════════════════════════════════╗
# ║       ResonanceBFT — Pentadic Alignment Report       ║
# ╠══════════════════════════════════════════════════════╣
# ║  Status: committed   Quorum: 7/5  Threshold: 0.60   ║
# ╠══════════════════════════════════════════════════════╣
# ║  Agent        Sem    Aff    Rel    Epi    Beh     Overall    Align  ║
# ║  a0          0.912  0.834  0.791  0.703  0.881    0.8271  [✓] strong  ║
# ║  a2 (spy)    0.211  0.190  0.133  0.091  0.072    0.1468  [ ] weak    ║

Each agent's per-axis similarity against the (unweighted) commit centroid is shown alongside a qualitative alignment band (strong / moderate / borderline / weak). This makes the five-axis claim falsifiable: you can immediately see whether all five axes contributed or whether one dominated.

sycophancy_score(traj.evidence_delta) aggregates the per-step epistemic drift into a single per-agent score after deliberation:

traj = await plugin.deliberate(rnd, steps=3)
scores = sycophancy_score(traj.evidence_delta)
pressured = [aid for aid, s in scores.items() if s < -0.01]
# agents that moved toward less-confident peers (social pressure, not persuasion)

Positive = persuasion-driven (moved toward more epistemically confident peers); negative = pressure-driven (moved toward peers with lower confidence). The signal is the signed, trust-weighted confidence gap between the neighbours an agent is pulled toward and itself — genuinely peer-relative, not just the agent's own confidence trend. Operationalises Agarwal & Khanna 2025 (arXiv:2504.00374), "When Persuasion Overrides Truth in Multi-Agent LLM Debates," as a live per-round diagnostic. Falsifiable: test_sycophancy_detects_pressure_vs_persuasion runs two deliberations with identical semantic content where only the peers' certainty differs, and asserts the minority's sycophancy score flips sign — positive when pulled toward more-confident peers, negative toward less-confident ones.
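To make the sign convention concrete, here is one way a signed, trust-weighted confidence gap could be computed for a single agent at a single step. This is an assumption about the general form only: `confidence_gap` is a hypothetical helper, not the plugin's `sycophancy_score`, which takes `traj.evidence_delta` and aggregates across steps.

```python
def confidence_gap(own_conf, neighbours):
    """Signed, trust-weighted gap between the neighbours' confidence and the agent's own.

    neighbours: list of (trust, confidence) pairs for the peers pulling on this agent.
    """
    total_trust = sum(trust for trust, _ in neighbours)
    if total_trust == 0:
        return 0.0
    return sum(trust * (conf - own_conf) for trust, conf in neighbours) / total_trust

# Pulled toward more-confident peers: positive (persuasion-like).
persuaded = confidence_gap(0.4, [(0.8, 0.9), (0.6, 0.7)])
# Pulled toward less-confident peers: negative (pressure-like).
pressured = confidence_gap(0.8, [(0.8, 0.3), (0.6, 0.5)])
```

Only the peers' certainty differs between the two calls, which mirrors the setup of the falsifiability test described above.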

pentadic_summary() shows BOTH the fixed commit weights AND the learned Layer-3 weights (adaptive_axis_weights) alongside per-axis scores, so researchers can observe the slow learned-weight evolution across simulation epochs while seeing exactly which (fixed) weights the commit used:

╔══════════════════════════════════════════════════════╗
║       ResonanceBFT — Pentadic Alignment Report       ║
...
║  Axis weights: sem=0.24  aff=0.22  rel=0.27  epi=0.14  beh=0.13  ║
╚══════════════════════════════════════════════════════╝

Appendix C. TrustStore snapshot and restore

snap = plugin._store.snapshot()
traj_a = await plugin.deliberate(rnd, steps=3, epsilon=0.05)

plugin._store.restore(snap)  # rewind all memory
traj_b = await plugin.deliberate(rnd, steps=3, epsilon=0.30)
# compare trajectories under different epsilon — same participants, rewound state

All memory layers are deep-copied: reputation, scalar trust, per-axis trust, co-commit ledger, behavioral counters, semantic history. The ledger's tuple[str,str] keys are serialised as "a,b" strings for JSON-safety and reconstructed on restore.
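The key round-trip described above can be sketched as follows. `snapshot_ledger` and `restore_ledger` are hypothetical helper names, not the TrustStore API, and the sketch assumes agent ids contain no commas.

```python
import json
from typing import Dict, Tuple

Ledger = Dict[Tuple[str, str], int]


def snapshot_ledger(ledger: Ledger) -> str:
    # JSON object keys must be strings, so ("a", "b") is stored as "a,b".
    return json.dumps({f"{a},{b}": count for (a, b), count in ledger.items()})


def restore_ledger(blob: str) -> Ledger:
    restored: Ledger = {}
    for key, count in json.loads(blob).items():
        a, b = key.split(",", 1)  # assumes ids themselves contain no comma
        restored[(a, b)] = count
    return restored


ledger = {("a0", "a1"): 3, ("a1", "a2"): 1}
snap = snapshot_ledger(ledger)
ledger[("a0", "a1")] = 99          # mutate live state after the snapshot
restored = restore_ledger(snap)    # the snapshot is unaffected by the mutation
print(restored)
```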

Appendix D. Three-timescale adaptive weights

Static numeric constants work for a single run but become miscalibrated over long simulations as the agent population or task distribution shifts. ResonanceBFT organises twelve parameters across three learning timescales so the protocol self-calibrates while preserving all BFT safety invariants:

Layer 3 (slow, every 200 rounds)
  └─ axis_weights: Exponentiated Gradient from ConsensusType quality signal
       ↓ feeds as prior into Layer 2

Layer 2 (medium, every 50 rounds)
  └─ base_epsilon:  (a) rises on deadlock/polarized; falls on coalitional/capitulated
                    (b) also rises when median pairwise similarity < 0.50 (agents not converging)
  └─ threshold:     rises on false-quorum signal; clamped to BFT safety lower bound

Layer 1 (fast, every round)
  └─ trust_decay[i→j]:  silence penalty + strength buffer (Arena et al. 2023)
  └─ trust_gain[i→j]:   (1−decay) × 0.7 × freq_scale  (stability-guaranteed)
  └─ trust_loss[i→j]:   gain × 2.25  (λ ratio; Bleichrodt & L'Haridon 2023)
  └─ epsilon[i→j][axis]: sigmoid co-commit boost × per-axis multiplier

Scope of the learned weights (stated precisely). The Layer-3 axis_weights are updated by the Exponentiated Gradient from the ConsensusType quality signal and are surfaced for observability as adaptive_axis_weights (shown as "Learned weights (L3)" by pentadic_summary(), distinct from the fixed commit weights). They are load-bearing in L2 deliberation: once the slow learner drifts them from the seed, deliberate() scales each axis's convergence step by learned/seed, so the group pulls harder on the axes that have historically predicted genuine consensus (verified by test_learned_axis_weights_are_load_bearing_in_deliberation). This is carefully bounded — it is a no-op at the seed weights (so short runs are byte-identical and the effect only emerges over long, warmed-up simulations), and it is deliberately not read by the L1 commit (which uses fixed _AXIS_WEIGHTS so the certificate stays resolver-independent) nor by the trust-free auto-deliberation pass (so the committed diagnostics stay resolver-independent).

BFT safety is preserved by construction. Learned parameters shape who deliberates and how fast positions converge, but never alter the commit certificate size n−f — and, per the BFT-agreement section above, the commit gate never even reads the learned threshold/axis-weights (it uses fixed protocol constants). The adaptive threshold therefore only ever influences deliberation; as defence-in-depth it is still hard-clamped to the BFT safety lower bound so even that deliberation signal can never drift below what n−f agreement can tolerate:

def clamp_threshold(self, n: int, f: int) -> None:
    lb = max(0.0, 1.0 - 2.0 * f / (n - f))   # Cambus et al. 2025
    if self.threshold < lb:
        self.threshold = lb
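To make the clamp concrete, the sketch below evaluates the same lower-bound expression for a few (n, f) pairs. `bft_lower_bound` is a hypothetical standalone helper that only mirrors the formula inside clamp_threshold.

```python
def bft_lower_bound(n: int, f: int) -> float:
    """Same expression as the clamp above: max(0, 1 - 2f / (n - f))."""
    return max(0.0, 1.0 - 2.0 * f / (n - f))


bounds = {
    (n, f): round(bft_lower_bound(n, f), 4)
    for n, f in [(4, 1), (7, 2), (10, 3)]
}
print(bounds)
```

For n = 7, f = 2 the bound is 1 − 4/5 = 0.2, so a learned threshold that drifted below 0.2 would be raised back to it.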

Stability invariant for per-dyad gain (den Boer et al. 2024):

gain = available * 0.7 * freq_scale    # available = 1 − decay
assert gain < (1 - decay)              # always holds by construction
loss = gain * 2.25                     # λ ratio fixed by meta-analysis

Static _TRUST_GAIN = 0.18 violated this (0.18 > 1 − 0.92 = 0.08). The adaptive version enforces the stability condition algebraically so trust dynamics cannot diverge over long runs.
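The comparison can be checked numerically. This is a minimal sketch: `adaptive_gain` is a hypothetical helper, and it assumes freq_scale lies in (0, 1] and decay is strictly below 1, which is what makes the inequality hold.

```python
def adaptive_gain(decay: float, freq_scale: float) -> float:
    available = 1.0 - decay            # headroom left after decay
    return available * 0.7 * freq_scale


decay = 0.92
static_gain = 0.18
static_ok = static_gain < (1.0 - decay)      # 0.18 vs 0.08

gain = adaptive_gain(decay, freq_scale=1.0)  # worst case for the bound
adaptive_ok = gain < (1.0 - decay)
loss = gain * 2.25                           # same fixed ratio as above

print(static_ok, adaptive_ok, round(gain, 3), round(loss, 3))
```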

Sigmoid ε boost replaces the linear-capped 0.05 × min(co_commits, 5):

from math import exp

boost = 0.25 / (1 + exp(-0.35 * (co_commits - 5)))   # logistic (sigmoid), saturates at 0.25
eps = (base_epsilon + boost) * AXIS_EPSILON_MULTIPLIERS[axis]

Established pairs get a smooth, saturating neighbourhood expansion rather than an abrupt cap. Per-axis multipliers (affective=1.8, behavioral=1.3, relational=1.2, semantic=1.0, epistemic=0.9) encode the social-psychology finding that affective states update faster than cognitive beliefs (Li, Luo & Chu 2025).
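A side-by-side sketch of the old linear-capped boost and the sigmoid boost follows. The multiplier values are the ones listed above. The `epsilon` wrapper and its clamp to [0.05, 0.50] are assumptions: the bounds come from the ε invariant in Appendix F, but where the plugin applies them is not shown in this section.

```python
import math

AXIS_EPSILON_MULTIPLIERS = {
    "affective": 1.8, "behavioral": 1.3, "relational": 1.2,
    "semantic": 1.0, "epistemic": 0.9,
}


def linear_boost(co_commits: int) -> float:
    return 0.05 * min(co_commits, 5)          # old rule: abrupt cap at 5


def sigmoid_boost(co_commits: int) -> float:
    return 0.25 / (1.0 + math.exp(-0.35 * (co_commits - 5)))


def epsilon(base_epsilon: float, co_commits: int, axis: str) -> float:
    raw = (base_epsilon + sigmoid_boost(co_commits)) * AXIS_EPSILON_MULTIPLIERS[axis]
    return min(0.50, max(0.05, raw))          # assumed clamp location


for c in (0, 5, 10, 20):
    print(c, round(linear_boost(c), 3), round(sigmoid_boost(c), 3))
```

Both rules approach 0.25, but the sigmoid reaches its midpoint (0.125) at five co-commits and keeps growing smoothly afterwards instead of stopping dead at the cap.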

Appendix E. Edge cases handled explicitly

Byzantine defense-in-depth matrix: five attacks each met by a specific defense and guarantee — tamper→seal, collude→trimmed mean, newcomer-capture→reputation dampening, equivocation→conflict certificate, overwhelm→n-f abort

| Case | Behavior |
| --- | --- |
| n < 4 (BFT impossible) | resolve() returns status=aborted, reason=insufficient_participants |
| Zero participants | resolve() returns status=aborted, reason=no_evaluations |
| threshold ≤ 0 | __init__ raises ValueError immediately |
| threshold > 1 | valid: always-abort configuration; cosine ≤ 1 so no agent qualifies |
| Zero-magnitude embedding | _cosine(zeros, zeros) = 0.0 via or 1.0 norm guard |
| Empty text | _embed("") → all-zeros (no tokenisation, no error) |
| Byzantine at exactly f limit | quorum intersection safety: ` |
| Network partition (minority side) | n = fixed membership (not received-message count), so a 4-of-7 partition still needs quorum 5 and cannot commit alone, so no split-brain |
| Late vocab extension by a peer | the vocab is append-only, so prior agents are NOT re-embedded; their shorter semantic vector stays a prefix of the longer one and _cosine zero-pads, so their sealed commitment stays valid and they are never falsely flagged tampered |
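The zero-magnitude and late-vocab cases both come down to how cosine similarity treats short or all-zero vectors. A minimal sketch, assuming plain Python lists; `cosine_zero_padded` is a hypothetical stand-in for the plugin's _cosine, not its actual code.

```python
import math
from typing import Sequence


def cosine_zero_padded(u: Sequence[float], v: Sequence[float]) -> float:
    n = max(len(u), len(v))
    a = list(u) + [0.0] * (n - len(u))   # pad the shorter vector with zeros
    b = list(v) + [0.0] * (n - len(v))
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a)) or 1.0   # guard: zero norm becomes 1.0
    norm_b = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (norm_a * norm_b)


print(cosine_zero_padded([0.0, 0.0], [0.0, 0.0]))        # all-zero inputs
print(cosine_zero_padded([1.0, 2.0], [1.0, 2.0, 0.0]))   # prefix of a longer vector
```

Because an older, shorter vector is a prefix of the extended one, zero-padding leaves its similarity to itself unchanged.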

Appendix F. Invariants verified by tests

# BFT quorum intersection safety
|Q1 ∩ Q2| ≥ (n−f) + (n−f) − n = n−2f > f   iff n ≥ 3f+1   ✓
# verified by: test_bft_quorum_intersection_safety (Hypothesis, n∈[4,12], f∈[1,3])

# Adaptive simplex
sum(axis_weights.values()) ≈ 1.0   and   all(w ≥ 0.05)
# verified by: TestAdaptiveLayerProperties::test_axis_weights_always_on_simplex (80 examples)

# Dyad trust stability
gain < (1 − decay)   for any co_commit_count
# verified by: TestAdaptiveLayerProperties::test_stability_invariant_for_any_co_commits (100 examples)

# ε bounds
0.05 ≤ get_epsilon(...) ≤ 0.50   for any axis, any co_commit_count
# verified by: TestAdaptiveLayerProperties::test_epsilon_always_in_bounds (80 examples)

# BFT safety lower bound
threshold ≥ max(0, 1 − 2f/(n−f))   (clamped after every Layer-2 update)
# verified by: TestAdaptiveLayerProperties::test_bft_lower_bound_formula (50 examples)

# Gradient direction
good rounds reinforce specialisation; bad rounds push toward uniform
# verified by: test_good_rounds_reinforce_current_distribution / test_bad_rounds_push_toward_uniform

# Partition quorum safety (split-brain prevention)
quorum_needed = N − f with N = fixed membership, not len(evaluations received)
# verified by: TestPartitionQuorumSafety (minority aborts; heal commits)

# Threshold safety clamp actually fires
clamp_threshold(n, f) is invoked by record_round_outcome after each Layer-2 update
# verified by: test_record_round_outcome_clamps_threshold_after_layer2

# Late vocab extension does not frame honest agents
# verified by: test_late_vocab_extension_does_not_false_flag_earlier_agents
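The quorum-intersection and partition invariants can also be checked by plain enumeration over the same ranges the Hypothesis test uses. This is an illustrative sketch with hypothetical helper names, not the test suite itself.

```python
def min_quorum_overlap(n: int, f: int) -> int:
    """Smallest possible overlap of two quorums of size n - f drawn from n members."""
    q = n - f
    return max(0, 2 * q - n)


def can_commit(received: int, n: int, f: int) -> bool:
    # n is the fixed membership size, not the number of messages received.
    return received >= n - f


checked = []
for n in range(4, 13):
    for f in range(1, 4):
        safe = min_quorum_overlap(n, f) > f
        assert safe == (n >= 3 * f + 1)   # overlap exceeds f exactly when n >= 3f + 1
        if safe:
            checked.append((n, f))

# 7 members, f = 2: quorum is 5, so neither side of a 4/3 split can commit.
print(len(checked), can_commit(4, 7, 2), can_commit(3, 7, 2), can_commit(5, 7, 2))
```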

Appendix G. Related systems & where ResonanceBFT is novel

A 2026-06-30 literature survey (two independent deep-research passes, then per-paper verification against arXiv and the authors' publication lists) found no prior system that combines all four of: embedding/vector-valued opinions · temporal trajectory tracking · a Byzantine-fault-tolerant commit · a genuine-vs-superficial consensus-quality audit. The nearest systems each cover only part:

| System | Opinion repr. | Temporal | Consensus rule | Quality audit | BFT |
| --- | --- | --- | --- | --- | --- |
| Aegean (Ruan et al. 2025, arXiv:2512.20184) | answer strings | per-round only | incremental quorum | | safety+liveness |
| EVINCE (Chang 2024, arXiv:2408.14575) | response distributions | round entropy | entropy/Wasserstein phase-switch | info-theoretic proxy | |
| ConsensusLLM (Chen et al. 2023, arXiv:2310.20151) | scalar value | rounds | averaging | | |
| ResonanceBFT (this PR) | 5-axis vector | trajectory (velocity/evidence/stability) | BFT n−f over pentadic similarity | 8-type taxonomy + sycophancy + independence/capitulation/collapse | n ≥ 3f+1 |

Our defensible contribution: BFT-safe consensus over multi-axis opinion vectors with a trajectory-shape genuine-vs-superficial audit, using trajectory features (per-axis velocity, peer-relative evidence-delta, position stability) to tell genuine persuasion from capitulation, as a protocol-level signal alongside a resolver-independent n−f commit. The consensus-quality metrics reuse verified prior art: independence rate (Weng et al. 2025, BenchForm), confidence-weighted persuasion-without-evidence (Agarwal & Khanna 2025, CW-POR), and disagreement collapse (Yao et al. 2025). The five-axis vector lives in the free real-valued opinion regime where per-dimension consensus thresholds are close to the 1-D case (Fortunato et al. 2005); our lower 5-D threshold (0.60) is an empirical calibration, not a claim that higher dimensionality intrinsically eases consensus.

The semantic axis defaults to bag-of-words, but the plugin exposes an implemented, tested embed_fn injection point: ResonanceBFT(..., embed_fn=encoder) swaps the semantic axis for any dense encoder (e.g. a sentence-transformer, or Chain-of-Thought-trace embeddings, Gatto et al. 2023). Because each agent's semantic vector is sealed at participate() and never recomputed at commit, this does not touch the resolver-independent commit path: the embedding is fixed once and read from sealed metadata, so cross-node model nondeterminism cannot affect agreement (pinned by TestEmbedFn). Using a dense fixed-dim axis also removes the heterogeneous-vocab and semantic-participation-order edge cases. The default stays bag-of-words so the plugin is drop-in with no heavy ML dependency; bundling a specific pretrained model, and a state-space temporal model, remain documented future work.

Appendix H. Academic context

Every citation maps to a real, published paper with a verifiable link, each confirmed in a focused 2026-06-30 verification pass (web + arXiv + the authors' own publication lists) that also corrected several ids an automated survey had gotten wrong; the corrections are noted inline in coordination/resonance_bft/REFERENCES.md (arXiv / DOI for each). Code comments use short author-year tags; REFERENCES.md is authoritative. Key verified anchors: Cambus et al. 2025 (arXiv:2504.01504), Li-Luo-Porter 2024 (arXiv:2303.07563), Li-Luo-Chu 2025 (arXiv:2502.00284), Thompsky-Wu-Porter-Luo 2026 (arXiv:2605.20418), Agarwal & Khanna 2025 (arXiv:2504.00374), Alpos-Cachin et al. (arXiv:1906.09314), Hegselmann-Krause 2002 (JASSS), Russell 1980 (circumplex). The Layer-3 tanh-saturation emphasis is grounded in Kivinen & Warmuth 1997 (EG), Aitchison 1982 (log-ratio simplex geometry), Brooks-Chodrow-Porter 2024 (Sigmoidal BCM), Sampson-Restrepo-Porter 2025 (tanh influence kernel), and Latané 1981 (sublinear social impact), framed as an engineering choice consistent with, not a verbatim implementation of, these.

Core BFT foundations

Dimensionality domination fix (novel)

  • Cheng et al. (2024). Multidimensional opinion dynamics with heterogeneous bounded confidence. Automatica (arXiv:2412.02710). — proves d-dimensional HK convergence with per-axis ε; justifies per-axis threshold design.
  • Prior art: no BFT protocol uses per-axis weighted similarity for quorum; nearest is multi-issue consensus reaching (Li et al. 2022) but without Byzantine safety.

Affective axis

  • Russell (1980). A circumplex model of affect. JPSP, 39(6). — the (valence, arousal) _affective axis. ResonanceBFT treats affective convergence as a quorum dimension, not merely a UI signal.

Commitment scheme (novel)

  • Sealing the immutable belief axes (not combined) is an original design choice with no direct prior art. The insight: deliberation legitimately updates positions; sycophancy lives in belief revision, not position update. All five belief axes are sealed (and the signature binds the commitment) so every axis the quorum weights is tamper-evident.

Evidence delta (Agarwal & Khanna 2025)

  • Agarwal & Khanna (2025). "When Persuasion Overrides Truth in Multi-Agent LLM Debates," arXiv:2504.00374 — agents converge on a persuasive but wrong answer; operationalised in evidence_delta per-step tracking. Positive = persuasion; negative = pressure.

Coalitional trajectory (Leifeld & Brandenberger 2024)

  • Leifeld & Brandenberger (2024). Endogenous coalition formation (arXiv:1904.05327). — bonding mechanism. ResonanceBFT uses median (not min) co-commit count so a single stranger pair doesn't veto established alliances.

Per-dyad adaptive ε (adaptive bounded-confidence)

  • Thompsky, Wu, Porter & Luo (2026) (arXiv:2605.20418) — adaptive edge weights that govern interaction probability and rise after positive (co-commit-like) interactions. ResonanceBFT adapts the same pair-history signal into the ε radius (how far I move toward you) rather than the interaction probability. Verified 2026-06-30 (arXiv + M. A. Porter's publication list). Also grounded in the adaptive-bound line Li, Luo & Porter (2024) (arXiv:2303.07563) and Li, Luo & Chu (2025) (arXiv:2502.00284).

Layer-3 emphasis: smooth, bounded, log-symmetric saturation

  • The learned per-axis weights (Exponentiated Gradient, Kivinen & Warmuth 1997) modulate deliberation via exp(A·tanh(k·ln(w_learned/w_seed))) — a tanh in log-ratio space, the natural geometry for simplex weights (Aitchison 1982). Using a smooth bounded influence rather than a hard cutoff follows the Sigmoidal Bounded-Confidence Model (Brooks, Chodrow & Porter 2024, arXiv:2209.07004) and the tanh-kernel opinion model of Sampson, Restrepo & Porter 2025 (Phys. Rev. E 112:024303, arXiv:2408.13336); the saturating (diminishing-returns) character matches the sublinear social-impact law (Latané 1981). Presented as an engineering choice consistent with — not a verbatim implementation of — these results.
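The shape of the emphasis factor exp(A·tanh(k·ln(w_learned/w_seed))) can be explored with a short sketch. The values A = 0.5 and k = 1.0 are placeholders chosen for illustration; the constants the plugin actually uses are not given in this section.

```python
import math


def axis_emphasis(w_learned: float, w_seed: float, A: float = 0.5, k: float = 1.0) -> float:
    """exp(A * tanh(k * ln(w_learned / w_seed))); A and k are illustrative values."""
    return math.exp(A * math.tanh(k * math.log(w_learned / w_seed)))


at_seed = axis_emphasis(0.2, 0.2)     # no drift from the seed weight
doubled = axis_emphasis(0.4, 0.2)     # weight doubled relative to seed
halved = axis_emphasis(0.1, 0.2)      # weight halved relative to seed
extreme = axis_emphasis(1e6, 1.0)     # very large drift still stays bounded

print(at_seed, round(doubled * halved, 12), extreme <= math.exp(0.5))
```

Under these assumptions the factor equals 1 at the seed weights, doubling and halving are reciprocal (log-symmetry), and the value never leaves [exp(−A), exp(A)].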

Trust decay

  • Arena, Mulder & Leenders (2023). How fast do we forget our past social interactions? Network Science 11(2). — exponential decay best-fits social-interaction memory (≈64-day half-life); grounds the _TRUST_DECAY (trust *= 0.92) exponential-forgetting choice.
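As a small worked calculation, the half-life implied by a per-round multiplier of 0.92 can be computed directly. The result is in rounds; how rounds map to the cited day-scale half-life is not specified in this section, so no such mapping is assumed here.

```python
import math

TRUST_DECAY = 0.92  # per-round multiplier quoted above

# Solve TRUST_DECAY ** h == 0.5 for h.
half_life_rounds = math.log(0.5) / math.log(TRUST_DECAY)
print(round(half_life_rounds, 2))
```

With no reinforcing interactions, trust falls to half its value after a little over eight rounds.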

Logrolling / cross-axis Offers

Asymmetric weighting (local variant)

  • Cachin et al. (2019/2024). Asymmetric Distributed Trust. arXiv:1906.09314. — binary quorum membership. ResonanceBFT: continuous rep × outbound_trust product, locally observable.

Adaptive weights (new)

Full annotated bibliography — every entry with its exact usage in the code, arXiv/DOI links, and the correction history (including the ids an automated survey got wrong): REFERENCES.md.

@srcJin srcJin changed the title [Hackathon] srcJin: ResonanceBFT — pentadic BFT consensus for social agent networks [Hackathon] srcJin: Problem #10 — ResonanceBFT, pentadic partition-tolerant BFT consensus for social agent networks Jul 2, 2026