[Hackathon] srcJin: Problem #10 — ResonanceBFT, pentadic partition-tolerant BFT consensus for social agent networks#58
…orks Problem projnanda#10 (partition-tolerant BFT consensus). Agents agree on a five-axis belief vector (semantic / affective / relational / epistemic / behavioral) rather than a single value, with a strict layer separation:
- L1 safety core: deterministic n-f quorum commit over sealed evaluations. Every record seals all five belief axes plus the BoW vocabulary basis (SHA-256 commitment) and carries a round- and identity-bound ed25519 signature; tampered records are excluded before they can touch the centroid. The commit certificate is a pure function of shared sealed state: fixed threshold, fixed axis weights, a non-lowerable membership floor, and a coordinate-wise trimmed mean trimmed by the configured fault bound f (box-validity MBAA at n >= 3f+1).
- L2 audit: consensus-quality diagnostics (genuine vs sycophantic agreement, evidence-delta, trajectory classification, stance-aware false-agreement detection) that never alter the L1 commit.
- L3 adaptation: three-timescale learning (per-round dyadic trust, per-epoch threshold/epsilon, slow axis weights via exponentiated gradient), structurally barred from the commit gate.
Ships with adversarial validators (equivocation certificates, forged-quorum and fork detection), a ScenarioRunner scenario driving the real protocol over the transport with silent / Byzantine / partition / divergent-vocab fault modes, runnable examples with real-LLM town evidence, an annotated and verified bibliography, and 18 self-contained architecture diagrams. 321 plugin tests (unit + Hypothesis properties + adversarial + e2e; a live-marked real-LLM test excluded from CI). ruff + pyright clean.
Appendix
Reference material and the finer detail, referenced from the body above. None of it is required to grasp the thesis; all of it is here so the claims are checkable.
Appendix A. Package structure
The five
Appendix B. Observability report
Two analysis helpers ship alongside the protocol, not just for debugging but as first-class analytical instruments that a cognitive-social-computing researcher would actually reach for after every run:
outcome = await plugin.resolve(rnd)
print(pentadic_summary(outcome.metadata))
# ╔══════════════════════════════════════════════════════╗
# ║ ResonanceBFT — Pentadic Alignment Report ║
# ╠══════════════════════════════════════════════════════╣
# ║ Status: committed Quorum: 7/5 Threshold: 0.60 ║
# ╠══════════════════════════════════════════════════════╣
# ║ Agent Sem Aff Rel Epi Beh Overall Align ║
# ║ a0 0.912 0.834 0.791 0.703 0.881 0.8271 [✓] strong ║
# ║ a2 (spy) 0.211 0.190 0.133 0.091 0.072 0.1468 [ ] weak ║
Each agent's per-axis similarity against the (unweighted) commit centroid is shown alongside a qualitative alignment band (strong / moderate / borderline / weak). This makes the five-axis claim falsifiable: you can immediately see whether all five axes contributed or whether one dominated.
traj = await plugin.deliberate(rnd, steps=3)
scores = sycophancy_score(traj.evidence_delta)
pressured = [aid for aid, s in scores.items() if s < -0.01]
# agents that moved toward less-confident peers (social pressure, not persuasion)
Positive = persuasion-driven (moved toward more epistemically confident peers); negative = pressure-driven (moved toward peers with lower confidence). The signal is the signed, trust-weighted confidence gap between the neighbours an agent is pulled toward and itself, so it is genuinely peer-relative, not just the agent's own confidence trend. Operationalises Agarwal & Khanna 2025 (arXiv:2504.00374), "When Persuasion Overrides Truth in Multi-Agent LLM Debates," as a live per-round diagnostic. Falsifiable:
Appendix C. TrustStore snapshot and restore
snap = plugin._store.snapshot()
traj_a = await plugin.deliberate(rnd, steps=3, epsilon=0.05)
plugin._store.restore(snap) # rewind all memory
traj_b = await plugin.deliberate(rnd, steps=3, epsilon=0.30)
# compare trajectories under different epsilon: same participants, rewound state
All memory layers are deep-copied: reputation, scalar trust, per-axis trust, co-commit ledger, behavioral counters, semantic history. The ledger's
Appendix D. Three-timescale adaptive weights
Static numeric constants work for a single run but become miscalibrated over long simulations as the agent population or task distribution shifts. ResonanceBFT organises twelve parameters across three learning timescales so the protocol self-calibrates while preserving all BFT safety invariants:
Scope of the learned weights (stated precisely). The Layer-3
BFT safety is preserved by construction. Learned parameters shape who deliberates and how fast positions converge, but never alter the commit certificate size
def clamp_threshold(self, n: int, f: int) -> None:
    lb = max(0.0, 1.0 - 2.0 * f / (n - f))  # lower bound, Cambus et al. 2025
    if self.threshold < lb:
        self.threshold = lb
Stability invariant for per-dyad gain (den Boer et al. 2024):
gain = available * 0.7 * freq_scale  # available = 1 − decay
assert gain < (1 - decay)  # always holds by construction
loss = gain * 2.25  # λ ratio fixed by meta-analysis
Static Sigmoid ε boost replaces the linear-capped
boost = 0.25 / (1 + exp(-0.35 * (co_commits - 5)))  # logistic sigmoid, saturates at 0.25
eps = (base_epsilon + boost) * AXIS_EPSILON_MULTIPLIERS[axis]
Established pairs get a smooth, saturating neighbourhood expansion rather than an abrupt cap. Per-axis multipliers (
Appendix E. Edge cases handled explicitly
Appendix F. Invariants verified by tests
Appendix G. Related systems & where ResonanceBFT is novel
A 2026-06-30 literature survey (two independent deep-research passes, then per-paper
Our defensible contribution: BFT-safe consensus over multi-axis opinion vectors with a
The semantic axis defaults to bag-of-words, but the plugin exposes an implemented, tested
Appendix H. Academic context
Core BFT foundations
Dimensionality domination fix (novel)
Affective axis
Commitment scheme (novel)
Evidence delta (Agarwal & Khanna 2025)
Coalitional trajectory (Leifeld & Brandenberger 2024)
Per-dyad adaptive ε (adaptive bounded-confidence)
Layer-3 emphasis: smooth, bounded, log-symmetric saturation
Trust decay
Logrolling / cross-axis Offers
Asymmetric weighting (local variant)
Adaptive weights (new)
Full annotated bibliography (every entry with its exact usage in the code, arXiv/DOI links, and the correction history, including the ids an automated survey got wrong):
ResonanceBFT — Pentadic BFT Consensus for Social Agent Networks
Layer: coordination
Plugin name: resonance_bft
Problem: #10 (Partition-tolerant BFT consensus)
Persona: cognitive-social-computing researcher — someone who came to distributed consensus from multi-agent social simulation, not from distributed databases. The question "did we agree?" is not a bit-level question when the agents are social and goal-directed; it is simultaneously semantic, affective, relational, epistemic, and behavioral.
Contents — outline & navigation
TL;DR · Claims → evidence · Motivation · Architecture · How consensus is reached
Quorum = n − f · Weighted pentadic quorum · Sealed commitments · Resolver-independent commit + box-validity
Deliberation (Hegselmann-Krause) · Trajectory classification · Time-decayed dyadic trust · Byzantine centroid dampening · Cold-start · Sybil guard & view-change · Adaptation, observability & replay
Tradeoffs · End-to-end through the town runner · Real-LLM town evidence · Verification · Safety proof sketch
A · Package structure · B · Observability report · C · snapshot / restore · D · Three-timescale weights · E · Edge cases · F · Invariants · G · Related systems · H · Academic context
Overview
TL;DR
ResonanceBFT is a partition-tolerant BFT consensus plugin for NANDA Town (problem #10). Agents agree not on a single value but on a five-axis belief vector (semantic · affective · relational · epistemic · behavioral). A deterministic n − f commit core (L1) owns safety; a social-science audit (L2) and three-timescale adaptation (L3) sit on top but never alter the commit. LLM nondeterminism is sealed at participate(), so the commit is a resolver-independent function of the sealed vectors.
Novel: agreement over a value space with box-validity MBAA (deliberately cheaper than convex validity: n ≥ 3f+1 vs 7f+1), cryptographic tamper-evidence + transferable equivocation certificates, and a genuine-vs-superficial consensus audit. That is classic BFT safety plus social interpretation, with L1 unweakened.
Verified: ruff clean · pyright 0 errors · 321 plugin tests (unit + Hypothesis property + e2e; 1057 whole-repo) · plus real-LLM town evidence: 140 real ScenarioRunner runs across mock / claude-haiku-4-5 / gpt-5.5 / Gemini 3.5 Flash, scaling to n = 49 (commit quorum = n − f), with the deterministic core model-agnostic.
Claims → evidence
- n−f quorums intersect in an honest node: test_bft_quorum_intersection_safety (Hypothesis, n∈[4,12])
- test_commit_is_resolver_independent_under_divergent_trust
- test_byzantine_extreme_cannot_push_aggregate_out_of_honest_box · test_resists_biased_minority_better_than_plain_mean
- TestPartitionQuorumSafety · test_metadata_cannot_lower_configured_membership
- tampered=2)
- test_deliberation_contracts_opinion_diameter
- test_learned_axis_weights_are_load_bearing_in_deliberation (L2 only)
- tests/test_resonance_bft_e2e.py · EVIDENCE.md / EVIDENCE_LARGE.md (n=4→49, 3 models)
Motivation
Classical BFT (PBFT, HotStuff, Tendermint) defines consensus as identical byte sequences. This is the right definition for a replicated ledger. It is the wrong definition for a social agent network where two agents might submit the same value for incompatible reasons, or where an agent that "agrees" but feels anxious about it will defect under the next adversarial pressure.
contract_net, the default coordination plugin, reduces agreement to "who bids lowest." It has no quorum logic, no view change, and no mechanism to detect whether agents are genuinely aligned or just strategically mimicking the leader's position (sycophancy).
ResonanceBFT upgrades the question. It requires simultaneous convergence across five orthogonal axes: semantic, affective, relational, epistemic, and behavioral.
BFT safety guarantees still hold: n ≥ 3f+1, quorum_needed = n−f. What changes is what counts as a vote: a sealed commitment over the text-derived belief axes, with a weighted pentadic similarity for quorum classification that gives each axis its own fair weight.
Architecture: how the pieces fit
The feature list below is not a flat pile; it is four layers over one representation, plus one load-bearing invariant. Classical BFT conflates who agrees (safety) with whether the agreement is genuine (quality — sycophancy, coercion, alliances all fake "agreement"). ResonanceBFT separates them and adds a learning loop:
Load-bearing invariant: L2 (authenticity) and L3 (adaptation) never alter the L1 commit certificate (n−f). That is what lets the protocol add social-science interpretation and self-tuning without weakening any BFT safety guarantee. Every concept below maps to exactly one layer: five axes → L0; Byzantine dampening / cold-start / Sybil → robustness; sycophancy / evidence_delta / trajectory types → L2; three-timescale learning → L3; co-commit ledger → the memory L2 and L3 share. The module docstring (__init__.py) and the seams (resolve = L1, deliberate = L2, record_round_outcome = L3) state their layer in code.
How consensus is reached: one agent, many agents, and the space
Three complementary views of the same convergence: one agent over time → many agents over time → many agents in the embedding space. This is the intuition for everything that follows; the mechanics are in the next two parts.
The commit certificate — L1 safety
This is the sacred core: how a quorum forms, how a vote is sealed, and why every honest node commits identically. Nothing here depends on trust, reputation, or learning.
Quorum formula: quorum_needed = n − f (not 2f+1)
The standard formula 2f+1 is only correct when n = 3f+1 exactly.
For n > 3f+1, two quorums of size 2f+1 can intersect in only f agents, all of which may be Byzantine.
ResonanceBFT uses quorum_needed = n − f, which gives the correct intersection for all n ≥ 3f+1: any two quorums overlap in at least 2(n − f) − n = n − 2f ≥ f + 1 agents, so at least one shared agent is honest.
Verified by the test_bft_quorum_intersection_safety Hypothesis property test.
Weighted pentadic quorum (anti-dimensionality-domination)
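The weighted quorum described in this section still gates on the n − f count from the quorum formula, so it is worth checking that intersection bound concretely first. The sketch below is not the plugin's test; it is a minimal brute-force check under the assumption f = ⌊(n−1)/3⌋, and the helper names `min_quorum_overlap` and `brute_force_overlap` are hypothetical.

```python
from itertools import combinations


def min_quorum_overlap(n: int, q: int) -> int:
    """Smallest possible overlap of two size-q quorums drawn from n agents."""
    # Pigeonhole: |A ∩ B| = |A| + |B| - |A ∪ B| >= 2q - n, and never below 0.
    return max(0, 2 * q - n)


def brute_force_overlap(n: int, q: int) -> int:
    """Exhaustive check of the same quantity; only sensible for small n."""
    quorums = [set(c) for c in combinations(range(n), q)]
    return min(len(a & b) for a in quorums for b in quorums)


for n in range(4, 11):
    f = (n - 1) // 3  # assumed fault bound for a cluster of size n
    # Two n - f quorums always share at least f + 1 agents, so one is honest.
    assert min_quorum_overlap(n, n - f) >= f + 1
    assert brute_force_overlap(n, n - f) == min_quorum_overlap(n, n - f)

# With 2f+1 quorums the guarantee is lost once n exceeds 3f+1.
n, f = 8, 2
print(min_quorum_overlap(n, 2 * f + 1))  # overlap that could be entirely Byzantine
print(min_quorum_overlap(n, n - f))      # overlap under the n - f rule
```

At n = 8, f = 2 the 2f+1 rule leaves an overlap no larger than f, while the n − f rule keeps it above f, which is the property the Hypothesis test asserts over a wider range.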
Naively concatenating all five axis vectors into one combined vector and taking the full-vector cosine gives the semantic axis (64 dims) ≈91% effective weight, rendering the other four axes nearly irrelevant.
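To make the domination effect concrete, the sketch below compares a full-vector cosine with a per-axis weighted aggregate. It is an illustration only: the axis sizes (64 for semantic, 2 for the rest) and the weight split are hypothetical values chosen so that relational is 0.25 and relational + epistemic + behavioral sum to 0.55; the real `_AXIS_WEIGHTS` may differ. It also compares two agents directly rather than each agent against a centroid.

```python
import math

# Hypothetical axis sizes and weights, for illustration only.
AXIS_DIMS = {"semantic": 64, "affective": 2, "relational": 2,
             "epistemic": 2, "behavioral": 2}
AXIS_WEIGHTS = {"semantic": 0.25, "affective": 0.20, "relational": 0.25,
                "epistemic": 0.15, "behavioral": 0.15}


def cosine(u, v):
    """Plain cosine similarity; returns 0.0 for a zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def naive_similarity(x, y):
    """Cosine over the concatenation of all five axes."""
    flat_x = [v for axis in AXIS_DIMS for v in x[axis]]
    flat_y = [v for axis in AXIS_DIMS for v in y[axis]]
    return cosine(flat_x, flat_y)


def weighted_similarity(x, y):
    """Per-axis cosine, aggregated with explicit axis weights."""
    return sum(w * cosine(x[axis], y[axis]) for axis, w in AXIS_WEIGHTS.items())


agent = {axis: [1.0] * d for axis, d in AXIS_DIMS.items()}
# Same semantic content, but opposite on the other four axes.
opponent = {axis: ([1.0] * d if axis == "semantic" else [-1.0] * d)
            for axis, d in AXIS_DIMS.items()}

naive = naive_similarity(agent, opponent)        # dominated by the 64 semantic dims
weighted = weighted_similarity(agent, opponent)  # four disagreeing axes outweigh one
print(round(naive, 3), round(weighted, 3))
```

Under these assumptions the concatenated cosine still looks like strong agreement even though four of five axes are diametrically opposed, while the weighted aggregate goes negative.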
ResonanceBFT computes, per axis, each agent's cosine similarity to that axis's centroid, then aggregates with explicit per-axis weights. At commit the centroid is an unweighted (trust-free) coordinate-wise trimmed mean over non-tampered records (resolver-independent; see the resolver-independence section); the explicit _AXIS_WEIGHTS are what give each axis its fair share despite the 64-vs-2 dimensionality gap:
During deliberate() (L2, not the commit), each axis centroid is additionally weighted by axis-specific trust (_axis_trust[me][other][axis]), so an agent trusted for semantic alignment pulls the semantic centroid more than one trusted only behaviorally. The L1 commit centroid deliberately drops this weighting to stay resolver-independent.
Sealed commitments (tamper-evidence for the belief axes)
Two-layer integrity (resolve() checks both; either failing ⇒ tampered):
- Seal: sha256(belief ‖ nonce) over all five sealed belief axes. This is cheap tamper-evidence that catches any party editing a sealed belief without recomputing the seal. For the bag-of-words semantic axis the seal also binds the vocabulary basis (sha256(belief ‖ nonce ‖ vocab)): since resolve() reconciles semantics over each record's vocab, relabelling that coordinate basis would reinterpret the sealed semantic values without touching them. Folding the basis into the commitment makes that relabelling break the seal → flagged tampered (test_vocab_tampering_is_detected).
- Signature: an ed25519 signature over (eval_text ‖ commitment), a real cryptographic authorship binding that a metadata-controlling adversary cannot forge without the agent's private key. Each ResonanceBFT derives a deterministic signing key from its seed (RFC 8032 signatures are deterministic, so reproducibility holds), signs at participate(), and resolve() verifies. Binding the commitment (not just the text) means a swapped vector, which forces a new commitment to pass the seal, also breaks the signature, closing the vector-swap attack. The signer re-signs after a legitimate vocab-extension re-embed (which changes the commitment), so honest agents stay valid.
Honest scope. Minting the public key ↔ agent identity binding (the PKI step) is delegated to the stack's identity layer (ed25519_rotating): by the layer separation, a coordination plugin should consume that binding, not reimplement the PKI. And it does: when the identity layer supplies round.metadata["identity_pubkeys"], resolve() rejects any record whose pubkey doesn't match the bound key for its aid (so a metadata adversary cannot substitute a self-signed record). By default a record for an agent with no bound key is accepted (the layered default); a security-conscious deployment can set round.metadata["require_identity_binding"] = True for fail-closed mode, where any record lacking an identity binding is rejected. Absent the map entirely, the plugin still provides cryptographic authorship + tamper-evidence, and full anti-equivocation is identity-layer + this signature together. We state this rather than overclaim.
The commitment seals all five belief axes the commit weights: semantic + affective + epistemic + behavioral + relational_sealed (the participate-time relational, never the deliberate()-mutated copy):
This is deliberate: relational/epistemic/behavioral carry 0.55 of the pentadic quorum weight, so sealing only semantic+affective would let a metadata-controlling adversary rewrite those three axes to inflate or evict a quorum member with no tamper flag. Sealing all five closes that hole.
Why not seal the full combined vector?
deliberate() is legitimately allowed to update the combined position (HK bounded-confidence updating). Sealing combined would flag every deliberating agent as a tamperer, and deliberate() specifically overwrites rec["relational"] from its own trust view, which is exactly why the seal (and the commit) read the immutable relational_sealed copy instead. The sealed axes are where sycophancy actually occurs: an agent changing what it claims to believe or how certain it claims to be after seeing the leader's position.
The tamper check recomputes the seal over all five axes (a malformed/incomplete record raises on reconstruction and is treated as tampered, not a crash):
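A minimal sketch of such a check follows. It is not the plugin's code: the record field names `nonce` and `commitment`, the JSON canonicalisation, and the helper names `seal` and `is_tampered` are assumptions made for illustration; only the sealed axis names, `vocab`, and the unsealed `relational` field come from the text above.

```python
import hashlib
import json

# The five sealed axes named in the text; "relational" (mutable) is not sealed.
SEALED_AXES = ("semantic", "affective", "epistemic", "behavioral",
               "relational_sealed")


def seal(record: dict, nonce: str) -> str:
    """sha256 over the five sealed axes, the nonce, and the vocab basis."""
    # A missing axis raises KeyError here; callers treat that as tampering.
    belief = {axis: record[axis] for axis in SEALED_AXES}
    payload = json.dumps(
        {"belief": belief, "nonce": nonce, "vocab": record["vocab"]},
        sort_keys=True,  # assumed canonical encoding so every node hashes alike
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_tampered(record: dict) -> bool:
    """Recompute the seal; malformed records count as tampered, not a crash."""
    try:
        return seal(record, record["nonce"]) != record["commitment"]
    except (KeyError, TypeError):
        return True


rec = {
    "semantic": [0.5, 0.5], "affective": [0.1, 0.9], "epistemic": [0.8, 0.2],
    "behavioral": [1.0, 0.0], "relational_sealed": [0.5, 0.5],
    "vocab": ["safety", "quorum"], "nonce": "n-001",
}
rec["commitment"] = seal(rec, rec["nonce"])

edited = dict(rec, epistemic=[0.0, 1.0])            # sealed axis rewritten
relabelled = dict(rec, vocab=["quorum", "safety"])  # basis relabelled, values untouched
incomplete = {k: v for k, v in rec.items() if k != "behavioral"}
deliberated = dict(rec, relational=[0.9, 0.1])      # unsealed, deliberate()-style update
```

The last case shows why the mutable `relational` field is left out of the seal: an honest deliberation update does not trip the check.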
validate_bft_no_equivocation re-derives the same five-axis seal during post-run validation.
Equivocation → a transferable conflict certificate (the clean slice of accountability). Detection alone only tells you an agent lied. When an agent signs two distinct commitments for the same (round_id, aid), the classic "tell A one thing, B another", the validator now also emits an equivocation conflict certificate: a self-contained bundle of the two conflicting signed records. Because each is an ed25519 signature over (eval_text ‖ commitment ‖ round_id ‖ aid), anyone, even a party who was never present and trusts no one, can verify it with only the agent's public key (verify_equivocation_certificate), so "we suspect X" becomes "here is X's own signature convicting it, twice." This is deliberately the one Byzantine fault with clean, provable attribution; we implement it (following Sheng et al. 2021, BFT Protocol Forensics, arXiv:2010.06785) rather than pay for full Polygraph-style O(n²) accountability the small-cluster setting does not warrant. Statistical faults (a single consistent-but-dishonest belief) remain the domain of the outlier / trust / social-audit layers; we do not overclaim a cryptographic proof where none exists.
BFT-agreement: the commit certificate is resolver-independent
A subtle BFT trap: if resolve() derives the quorum from anything in the resolver's private state, two honest nodes with divergent histories could disagree, which violates BFT agreement. The commit must be a pure function of shared replicated state. ResonanceBFT closes this on three paths. Every input to the quorum gate is either shared round metadata or a fixed protocol constant; nothing resolver-local enters it:
- resolve() gates the quorum with self._commit_threshold (the configured threshold, identical cluster-wide) and scores with the fixed seed _AXIS_WEIGHTS. It never reads the learned self._store.threshold (L2) or self._store.axis_weights (L3), which are resolver-local and diverge as each node learns. The adaptive values still shape deliberation and are surfaced as observability (adaptive_threshold / adaptive_axis_weights in metadata), but they are structurally barred from the commit. This makes the headline invariant, "L2/L3 never alter the L1 commit certificate", literally true in code rather than a slogan.
- Valid (non-tampered) but biased minority: a plain mean has breakdown point 0, so even one extreme honest-looking vector shifts it. We therefore use the coordinate-wise trimmed mean (Yin et al., ICML 2018, arXiv:1803.01498): per axis, drop the f largest and f smallest values, where f = ⌊(n−1)/3⌋ is the configured fault bound, then average. This raises the breakdown point toward the honest majority while staying deterministic: it sorts per-axis values, never agents, so the centroid is still a pure function of the sealed vectors and provably identical across honest resolvers. Trimming by f (not ⌊(k−1)/3⌋ of the arrived records) is what makes box validity hold at the n−f quorum floor: up to f of the k committed records can be Byzantine, and box validity requires trim ≥ (Byzantine present), so ⌊(k−1)/3⌋ would under-trim exactly there (n = 7, f = 2, k = 5: ⌊(k−1)/3⌋ = 1 < 2 leaves one biased extreme in the box). _trimmed_mean caps the trim at ⌊(k−1)/2⌋ so ≥1 value always survives (at the floor k = 2f+1 this leaves the coordinate median, which is still box-valid). For honest, well-aligned rounds the dropped values sit right next to the mean, so the trimmed and plain centroids are near-identical and the committed quorum is unchanged in practice (the suite's honest-round tests still pass); the trim only bites when a value is genuinely extreme: the biased-minority case (test_resists_biased_minority_better_than_plain_mean), and the quorum-floor case (test_box_validity_at_quorum_floor_trims_by_configured_f: two valid-but-biased records at n = 7, k = 5 cannot move the committed centroid, a regression that fails under the old ⌊(k−1)/3⌋ trim). Excluding tampered records and trimming biased extremes together mean neither a forged vector nor a valid-but-skewed one can drag the consensus point.
- resolve() does not rebuild relational/epistemic/behavioral from self._trust_matrix / self._past_semantics / self._get_behavior (resolver-local). It reads each agent's sealed axes from the shared metadata, relational specifically from the immutable relational_sealed, never the deliberate()-mutated rec["relational"] (that mutation was the live bug behind the earlier resolver-agreement gap).
quorum_needed = n − f is only safe if n cannot be shrunk by an adversary who controls the shared round metadata. resolve() computes n = max(present, metadata["expected_n"], self._expected_n, len(roster)): the max over every source the resolver knows locally, including its own constructor expected_n. So a Byzantine proposer can only ever raise the bar (a genuinely larger cluster), never lower it: a stripped or under-stated expected_n in the metadata cannot make a 4-of-7 partition believe it is the whole cluster and commit alone. Verified by test_metadata_cannot_lower_configured_membership (metadata claims n=4, roster stripped, only 4 of 7 present → the resolver configured with expected_n=7 still requires quorum 5 and aborts, no split-brain) and test_uses_constructor_expected_n_when_metadata_missing.
So every honest resolver derives the identical per-agent similarity, applies the identical fixed threshold, and reaches the identical quorum. Verified by test_commit_is_resolver_independent_under_divergent_trust, which diverges all three resolver-local inputs (private per-dyad trust, the adaptive L2/L3 threshold + axis-weights, and reputation) between two participant resolvers, runs deliberate() on each with its own trust first, then asserts byte-identical similarities, quorum_agents, status, and reported commit params. (Three earlier iterations silently masked the dependence: external empty-state resolvers; then diverging only trust while the commit still read adaptive weights; then a commit that still read reputation. All three are fixed.)
This is approximate agreement over a value space, and that is the correct model, stated precisely. ResonanceBFT agrees over a continuous five-axis value vector, not a discrete bit. The rigorous notion for consensus over vector values is Multidimensional Approximate Byzantine Agreement (MBAA), in which honest outputs are within ε of each other and inside the range of honest inputs (validity), not exact agreement (Dolev et al., JACM 1986; Vaidya & Garg, PODC 2013, arXiv:1302.2543; Mendes–Herlihy–Vaidya–Garg, Distributed Computing 2015). Demanding exact agreement on a discrete value is the wrong model for a value space; MBAA is the right one, and it is not a weaker form of BFT: it is the established formalism for this domain.
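Two of the commit-path mechanisms described above, the trim-by-f centroid and the non-lowerable membership floor, can be sketched in a few lines. This is a sketch under stated assumptions rather than the plugin's `_trimmed_mean` or `resolve()`: the function names and the sample values are hypothetical, and only the trim rule, the ⌊(k−1)/2⌋ cap, and the `max(...)` over membership sources come from the text.

```python
from statistics import fmean
from typing import List, Optional


def trimmed_mean(values: List[float], f: int) -> float:
    """Drop the f smallest and f largest values, then average the rest."""
    k = len(values)
    trim = min(f, (k - 1) // 2)  # cap so at least one value always survives
    kept = sorted(values)[trim:k - trim]
    return fmean(kept)


def trimmed_centroid_raw(vectors: List[List[float]], f: int) -> List[float]:
    """Coordinate-wise trimmed mean: sorts values per coordinate, never agents."""
    return [trimmed_mean(list(coord), f) for coord in zip(*vectors)]


def effective_n(present: int, metadata_n: Optional[int],
                configured_n: int, roster_len: int) -> int:
    """Membership floor: shared metadata can raise n but never lower it."""
    return max(present, metadata_n or 0, configured_n, roster_len)


# Quorum floor at n = 7, f = 2, k = 5: three honest values, two biased ones.
honest = [0.50, 0.52, 0.54]
arrived = honest + [9.0, 9.5]
by_f = trimmed_mean(arrived, 2)                        # trims both biased values
by_k = trimmed_mean(arrived, (len(arrived) - 1) // 3)  # under-trims by one

# A 4-of-7 partition whose metadata claims n = 4 still sees n = 7.
n = effective_n(present=4, metadata_n=4, configured_n=7, roster_len=0)
f = (n - 1) // 3
quorum_needed = n - f
can_commit = 4 >= quorum_needed
```

Trimming by the configured f keeps the aggregate inside the honest range, while the arrival-based trim lets one biased extreme through; the membership floor makes the partition abort instead of committing alone.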
Under MBAA, ResonanceBFT's guarantees are:
- Validity (box): the committed aggregate is the coordinate-wise trimmed mean (_trimmed_mean), each coordinate of which lies within the honest per-coordinate range ([f-th smallest, f-th largest]); because we trim by the configured f ≥ the Byzantine count present, a Byzantine minority cannot push that raw aggregate outside the honest value space. Verified by test_byzantine_extreme_cannot_push_aggregate_out_of_honest_box. Precisely: the committed direction is this aggregate normalised (_trimmed_centroid = _normalise(_trimmed_mean)), and the commit gates on cosine similarity (direction only), so the box-valid raw aggregate is what bounds where the commit points. We do not claim the normalised vector's coordinates themselves stay in the box (normalisation can move them; box validity is a property of _trimmed_mean, asserted on that helper). This is box validity; a coordinate-wise aggregate provably cannot give convex validity in d ≥ 2 (Cambus & Melnyk, arXiv:2306.12741). We do not claim convex validity, and choosing box over convex is the better engineering point on both axes their paper proves: convex validity forces a ≥ 2d = 10 worst-case centroid error and needs n ≥ (d+2)f+1 = 7f+1 at d = 5, whereas box validity gives 2√5 ≈ 4.47 at the standard n ≥ 3f+1. Convex validity via the SafeArea construction is available but costs half the fault tolerance for a worse centroid, so we consciously do not adopt it.
- ε-agreement: test_divergent_quorum_views_do_not_fork_shared_core (two overlapping 6-of-7 views both commit; every shared agent classified identically; shared-core divergence < 0.05). The per-view winner is a representative of the agreeing quorum, not a globally-canonical value, which is exactly what MBAA prescribes.
- Convergence: test_deliberation_contracts_opinion_diameter verifies the honest opinion diameter is monotonically non-increasing across deliberation steps and strictly decreases overall (e.g. 1.61 → 1.34 over four steps), the Dolev contraction realised. Deliberation need only converge; the committed trimmed-mean aggregate carries the box-validity guarantee. This directly targets the failure mode Berdoz, Rugli & Wattenhofer (2026, "Can AI Agents Agree?", arXiv:2603.01213) found dominant in LLM Byzantine consensus (loss of liveness / stalled convergence), which ResonanceBFT engineers against with the n−f commit plus this bounded-round convergence engine.
So the precise claim is: box-validity MBAA over a five-axis value space, with ε-agreement across overlapping quorum views, at n ≥ 3f+1. This is a named, proven correctness notion, and the correct (not over-claimed) one for consensus over values.
Tradeoff (stated honestly): each agent's sealed relational reflects its participate-time view (uniform for the first participant), so the relational axis does limited discriminating work at commit, which is the price of resolver-independence. Semantic + affective + epistemic + behavioral carry the quorum decision. When no roster is provided the relational axis is a purely neutral constant (identical for every agent); scoring it with its 0.25 weight would add a flat +0.25 to every pentadic score and silently loosen the threshold, so in that case resolve() drops relational from the commit and renormalises the four informative axes to sum to 1. The threshold then gates only on axes that actually carry information, and the "genuinely five-axis" claim stays honest: relational counts toward the commit only when a roster makes it discriminating (verified by test_neutral_relational_does_not_inflate_pentadic).
Byzantine centroid dampening now lives where it belongs: outside the L1 commit. Because the commit centroid is a trimmed mean over non-tampered records, BFT safety rests on the n−f quorum gate plus the trim's breakdown bound (not on resolver-local centroid weighting). The reputation-gap dampening still operates in deliberation (whose voice pulls the group) and in outlier/tamper penalties across rounds; its full mechanism is in the Byzantine centroid dampening section below.
Deliberation, authenticity & robustness (L2 / L3)
Everything in this part sits on top of the commit and, by the load-bearing invariant, never changes it. It is what turns a bare quorum into a genuine-vs-superficial judgement — and what keeps the network robust to newcomers, Byzantine entrants, and long-horizon drift.
Deliberation: Hegselmann-Krause with per-axis trust and per-dyad ε
deliberate() runs bounded-confidence (HK 2002) position updates:
Idempotency: a snapshot of pre-deliberation positions is stored in round.metadata["_pre_deliberation_snapshot"]; calling deliberate() twice restores the snapshot and re-runs, producing the same result.
Trajectory classification (9 types)
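The trajectories classified in this section are produced by the bounded-confidence update that deliberate() runs. As a rough intuition for that update, here is a textbook Hegselmann-Krause step on scalar opinions. It is deliberately simplified and not the plugin's implementation: it uses one global epsilon and plain averaging, whereas the text describes per-axis trust weighting and per-dyad ε; `hk_step` and `diameter` are hypothetical names.

```python
from typing import List


def hk_step(positions: List[float], epsilon: float) -> List[float]:
    """One synchronous Hegselmann-Krause update on scalar opinions."""
    updated = []
    for x in positions:
        # Only peers within the confidence bound pull on x (x itself included).
        neighbours = [y for y in positions if abs(y - x) <= epsilon]
        updated.append(sum(neighbours) / len(neighbours))
    return updated


def diameter(positions: List[float]) -> float:
    """Spread between the two most distant opinions."""
    return max(positions) - min(positions)


opinions = [0.10, 0.25, 0.40, 0.55, 0.90]
history = [diameter(opinions)]
for _ in range(3):
    opinions = hk_step(opinions, epsilon=0.2)
    history.append(diameter(opinions))
print([round(d, 3) for d in history])
```

Each new position is an average of existing ones, so the diameter cannot grow; the opinion at 0.90 has no neighbour within ε and stays put, which is the kind of non-converging behaviour the classification is meant to distinguish from genuine convergence.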
After deliberation, the trajectory is classified into one of nine consensus types based on five signals (velocities, concession_symmetry, axis_deltas, evidence_delta, and coalition history):
genuine · capitulated · coerced · logrolled · coalitional · fragile · polarized · deadlock · unknown
Evidence delta (Agarwal & Khanna 2025, arXiv:2504.00374): tracks whether the epistemic component of each agent's position was pulled toward higher-confidence peers (persuasion) or lower-confidence peers (social pressure). Distinguishes genuine from capitulated even when concession patterns look symmetric.
Coalitional shortcut (Leifeld & Brandenberger 2024): uses the median co-commit count across all pairs (not the minimum), so a single stranger pair in a group of veterans doesn't veto the coalitional label.
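The difference between the median and the minimum is easy to see on a small ledger. The sketch below assumes the co-commit ledger can be viewed as a mapping from unordered agent pairs to counts; that representation and the name `coalition_strength` are hypothetical, and the threshold the plugin applies to the result is not shown.

```python
from itertools import combinations
from statistics import median
from typing import Dict, FrozenSet, List


def coalition_strength(agents: List[str],
                       co_commits: Dict[FrozenSet[str], int]) -> float:
    """Median co-commit count over all unordered pairs in the group."""
    counts = [co_commits.get(frozenset(pair), 0)
              for pair in combinations(agents, 2)]
    return median(counts) if counts else 0.0


group = ["a0", "a1", "a2", "a3"]
# Veterans with 12 shared commits per pair, except one pair of strangers.
ledger = {frozenset(p): 12 for p in combinations(group, 2)}
ledger[frozenset(("a0", "a3"))] = 0

pair_counts = [ledger[frozenset(p)] for p in combinations(group, 2)]
print(coalition_strength(group, ledger), min(pair_counts))
```

With the minimum, the single stranger pair would drag the signal to zero; the median still reflects the shared history of the rest of the group.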
Time-decayed dyadic trust with per-axis overlay
Per-axis trust (_axis_trust[me][other][axis]) tracks faceted reliability: an agent can be trusted for semantic content but less so for behavioral follow-through. Used to weight per-axis centroids in deliberate() (L2); the L1 commit centroid in resolve() is unweighted, so this trust never enters the certificate.
Non-obvious invariant: Byzantine centroid dampening (pre-detection)
This governs the deliberation centroid (_asymmetric_weight), not the L1 commit: the commit centroid is an unweighted trimmed mean over non-tampered records (see BFT-agreement above), so commit safety rests on the n−f quorum. Where reputation does legitimately shape influence (whose voice pulls the group during deliberate()), a new Byzantine agent has far lower influence than an established veteran before any tampering is detected. The dampening mechanism is the reputation gap, evaluated through the real local-only outbound weighting weight[j] = rep(j) × trust(me→j):
A veteran who has earned reputation over many committed rounds has multiples of a newcomer's centroid weight, so a brand-new Byzantine entrant cannot capture the centroid before its tampering is detected and its reputation collapses. Verified by test_byzantine_centroid_weight_dampened, which now exercises the real _asymmetric_weight (not a hand-rolled model) across n_honest ∈ [4,10].
This is exactly why newcomers are not seeded to the network-median reputation: doing so would erase the gap and hand a stranger veteran-level pull. The obvious-looking "be nice to newcomers, start them at the median" alternative is a pre-detection Byzantine-amplification bug; we reject it deliberately (see cold-start below).
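The size of the reputation gap can be illustrated with the weighting formula quoted above. This is a hand-rolled model for intuition only, not the plugin's `_asymmetric_weight`: the reputation values (0.8 for veterans, 0.1 for a newcomer), the uniform trust of 0.5, and the name `centroid_weights` are all hypothetical.

```python
from typing import Dict


def centroid_weights(me: str, reputation: Dict[str, float],
                     trust: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Normalised outbound weights, weight[j] proportional to rep(j) * trust(me -> j)."""
    raw = {j: reputation[j] * trust[me].get(j, 0.0) for j in reputation}
    total = sum(raw.values())
    return {j: (w / total if total else 0.0) for j, w in raw.items()}


peers = ["v0", "v1", "v2", "v3", "byz"]
trust = {"me": {j: 0.5 for j in peers}}  # the observer trusts everyone equally

# Newcomer keeps a low initial reputation (hypothetical 0.1 vs 0.8).
earned = {"v0": 0.8, "v1": 0.8, "v2": 0.8, "v3": 0.8, "byz": 0.1}
weights = centroid_weights("me", earned, trust)

# The rejected alternative: seed the newcomer at the network median.
seeded = dict(earned, byz=0.8)
seeded_weights = centroid_weights("me", seeded, trust)

print(round(weights["byz"], 3), round(seeded_weights["byz"], 3))
```

Under these numbers each veteran carries eight times the newcomer's pull on the deliberation centroid, and seeding the newcomer at the median hands it an equal share on arrival.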
Cold-start: integrating a new agent without an isolation spiral (or a Byzantine hole)
A genuinely new agent arrives with no reputation, no dyadic trust, and no co-commit history. Left unaddressed, one early divergence triggers a destructive feedback loop: outlier → trust penalty → lower neighborhood inclusion → more divergence → permanent isolation. But the naive fix (inflate the newcomer's standing) re-opens the Byzantine hole above. ResonanceBFT threads this with two asymmetric, safety-preserving mechanisms:
- Trust-penalty grace period (_is_newcomer). For the first _NEWCOMER_GRACE_ROUNDS (3) outcomes an observer has seen a peer in, no local dyadic-trust penalty is applied for being an outlier, which breaks the cascade at its source. The signal is the observer's own peer_encounters counter, the only newcomer signal a distributed node actually maintains (a peer's own participation count lives in the peer's store, not yours). Reputation is never waived (it is the global, observable misbehavior signal), and tampering is never waived (deliberate attack ≠ innocent divergence).
- No reputation inflation: newcomers start at rep = _REPUTATION_INIT, preserving Byzantine centroid dampening. Voice is earned through committed rounds, not granted on arrival.
Crucially, BFT safety is untouched: the grace period changes how fast private trust moves, never the commit certificate size (quorum_needed = n − f). A newcomer, honest or Byzantine, is still one vote among n and cannot force a commit. Verified by test_newcomer_trust_protected_for_grace_rounds, test_persistent_outlier_loses_trust_after_grace (the penalty does fire once grace expires), test_tampering_penalised_even_during_grace, and test_newcomer_reputation_stays_low_for_byzantine_dampening.
Sybil guard & view-change
Sybil guard: idempotent participate(). An agent that calls participate() twice cannot double-vote or overwrite its sealed commitment. The guard is enforced at the round-metadata level, so it works even if multiple coroutines race to call participate on the same round object.
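A minimal sketch of what an idempotent, metadata-level guard could look like. The Round stand-in, the "evaluations" key, and the record shape are hypothetical names chosen for illustration, not the plugin's actual data model, and the race-safety argument assumes all coroutines run on a single event loop.

```python
from dataclasses import dataclass, field


@dataclass
class Round:
    """Hypothetical stand-in for the plugin's Round; only metadata matters here."""
    round_id: str
    metadata: dict = field(default_factory=dict)


def participate(round_: Round, agent_id: str, sealed_record: dict) -> dict:
    """Store an agent's sealed evaluation at most once per round.

    A repeat call returns the record stored by the first call, unchanged.
    """
    # "evaluations" is an assumed key name, not taken from the plugin.
    evaluations = round_.metadata.setdefault("evaluations", {})
    # dict.setdefault only writes when agent_id is absent, and there is no
    # await between the check and the write, so coroutines sharing one event
    # loop cannot interleave inside this step.
    return evaluations.setdefault(agent_id, sealed_record)


round_ = Round("r1")
first = participate(round_, "agent-a", {"commitment": "abc"})
second = participate(round_, "agent-a", {"commitment": "xyz"})  # ignored
```

Because the first write wins, the second call neither adds a vote nor replaces the sealed commitment; it simply hands back the original record.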
View-change: round-robin leader rotation. On abort, the next proposer is chosen deterministically so any agent can verify the expected leader without coordination.
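One way such a deterministic rotation could be written is sketched here. It assumes the canonical order is the sorted list of agent ids and that the metadata key for the new leader is "proposer"; both are illustrative assumptions, and the plugin may order agents or name the key differently.

```python
def next_proposer(agents: list[str], current_proposer: str) -> str:
    """Return the agent after current_proposer in a canonical (sorted) order."""
    ordered = sorted(agents)  # every agent derives the same order locally
    position = ordered.index(current_proposer)
    return ordered[(position + 1) % len(ordered)]  # wrap around at the end


def view_change_metadata(agents: list[str], aborted_proposer: str) -> dict:
    """Metadata for the fresh round that follows an abort."""
    return {
        "view_change": True,
        # "proposer" is a hypothetical key name used for illustration.
        "proposer": next_proposer(agents, aborted_proposer),
    }


agents = ["carol", "alice", "bob"]
meta = view_change_metadata(agents, "carol")
```

Since the result depends only on the shared membership list and the aborted proposer, any agent can recompute it and reject a re-proposal from an unexpected leader.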
The re-proposal carries view_change: True plus the new proposer in its metadata.
Honest scope of the view change. This is a simplified round-robin re-proposal on abort, not a full PBFT/HotStuff view-change protocol. Two honest properties hold: (a) a view change only follows an abort (a round that reaches an n−f quorum commits, so it never triggers a new view), hence at most one view commits per decision; and (b) each re-proposal is a fresh Round with a new round_id. The consequence, stated rather than hidden: because views carry distinct round_ids, validate_bft_no_conflicting_commits (which groups by round_id) enforces the no-fork property within a view, and cross-view safety rests on property (a) rather than on a committed-value carry-forward across views. A full view-change protocol that links views of one decision and forwards the highest committed certificate is deliberately out of scope for this reference plugin, and we do not claim it.
Adaptation, observability & replay (long-horizon detail → appendix)
Three capabilities round out the L2/L3 stack. All three are deliberately kept out of the L1 commit path and, in the short graded scenarios, are largely dormant — so their full detail lives in the appendix, with the essentials summarised here:
- pentadic_summary() / sycophancy_score(). First-class analytical instruments (not just debug prints): one renders the per-axis alignment table from a resolved outcome, making the five-axis claim falsifiable; the other scores each agent's persuasion-vs-pressure drift after deliberation. → rendered report + falsifiability tests: Appendix B.
- TrustStore.snapshot() / restore(). Deep-copies every memory layer so a researcher can rewind and re-run deliberation under different parameters on identical state. → Appendix C.
Tradeoffs, evidence & verification
The design is opinionated; this part states what was traded away, then shows the protocol actually committing — first through the town runner with scripted stances, then with real LLM opinions at scale — and how to reproduce all of it.
Tradeoffs
- … combined) | combined | deliberate() legitimately updates combined; sealing it flags every participant as tampered. The five belief axes carry the quorum weight, so all five are sealed; sycophancy lives in belief revision, not position updates.
- … _AXIS_WEIGHTS)
- trust[me→j] | mean(trust[k→j])
- position_stability=1.0 gives unearned epistemic authority; 0.5 is the neutral prior.
- deliberate() | deliberate() idempotent; safe to call multiple times (e.g., with different step sizes) without corrupting the round state.
- quorum_needed = n−f | quorum_needed = 2f+1 | 2f+1 only guarantees honest intersection when n = 3f+1 exactly. n−f is correct for all n ≥ 3f+1.
- threshold = 0.60 | threshold = 0.90
End-to-end: consensus actually driven through the town runner
Earlier revisions disclosed honestly that the framework's generic consensus scenario is a toy leader/follower vote that ignores the coordination plugin, so no town run actually exercised ResonanceBFT; the protocol was covered only by the unit/property suite. That gap is now closed. nest_plugins_reference/scenarios/resonance_bft_consensus.py registers (via the public register_scenario API, with no core changes) a scenario whose agents drive the real protocol over the simulator's in-memory transport:
- The leader propose()s a Round and seals its evaluation.
- Each follower participate()s and returns a self-contained Vote carrying its full sealed record in Vote.metadata, so a generic transport can drive resolve() from the votes alone, not only via the shared Round.metadata the single-process runner mutates (test_votes_are_self_contained_for_generic_transport).
- The leader resolve()s the n−f quorum and commit()s, then broadcasts the committed Outcome so every agent applies commit() and adapts its own trust: genuine multi-agent L3 adaptation, not leader-only (verified in the e2e test by follower O| receipts).
Each agent gets its own ResonanceBFT instance from the runner-resolved class, exactly how the framework instantiates a coordination plugin. The default BoW semantic axis is transport-safe: each record carries the vocabulary it was embedded over and resolve() remaps every record onto the canonical union vocab, so followers that extended the vocab with different private words are still compared on aligned coordinates (test_transport_divergent_vocab_not_falsely_aligned).
Running scenarios/resonance_bft_consensus.yaml (12 agents × 6 rounds, dense embedding on the semantic axis) through the real ScenarioRunner produces this trace, which is the protocol's own per-round output, not lifecycle events:
[Trace table not recoverable; surviving column header: false_agreement]
The driver resolves at the n−f quorum (9 of 12), not at unanimity: it commits as soon as a quorum of sealed evaluations arrives, so slow or silent agents never block liveness. The split rounds are the interesting ones: agents that say "approve" and agents that say "reject" to the same proposal are cosine-close on the semantic axis, so the quorum still forms (a false consensus by topic similarity), but the antonym-anchored stance audit (_polarity.py, grounded in the Linear Representation Hypothesis / SensePOLAR) flags false_agreement ≈ 0.56, while genuine unanimous rounds read 0.00. This is the embedding scheme and the stance audit running end-to-end in a live multi-round town simulation, not in a unit test. The audit is diagnostic only: it never gates the n−f commit certificate (the load-bearing "L2/L3 never alter L1" invariant), and because it reads only sealed vectors + a fixed direction it stays resolver-independent.
Fault tolerance, demonstrated through the runner.
scenarios/resonance_bft_consensus_faulty.yaml (7 agents, silent: 2, i.e. two crashed/partitioned followers that never respond) still commits at quorum = 5/5 = n−f from the 5 responders, demonstrating the n−f tolerance in a live run, not just the unit suite (test_resonance_bft_commits_at_quorum_despite_silent_agents). Conversely, under a genuine partition, scenarios/resonance_bft_consensus_partition.yaml (4/3 split, only 4 reachable, quorum_needed = 5) produces no commit while partitioned: liveness requires a sufficient quorum. All three are CI-verified by tests/test_resonance_bft_e2e.py, which runs them through the real ScenarioRunner and asserts on the consensus events in the trace.
The same run with a real no-torch encoder swapped in (embed: model2vec, or fastembed's bge-small via ONNX) confirms the benchmark's prediction in the live town: the contextual encoder (fastembed) flags the split rounds identically (false_agreement = 0.55), while the static encoder (model2vec) commits the same rounds but reads 0.0, because static embeddings cannot separate stance (matching the opp_sep_rate 1.00 vs 0.67 in BENCHMARKS.md). The mechanism, the benchmark, and the live multi-round simulation all agree.
The recommended real encoder is therefore fastembed bge-small, a contextual, attention-based transformer run via ONNX Runtime (no PyTorch). This distinction is load-bearing, not cosmetic: model2vec is a static embedding (distilled token vectors looked up and averaged, with no attention at inference), which is precisely why it cannot separate stance; the attention-derived representation is what makes the polarity direction linear (Park et al. 2024) and the audit fire. For stance specifically, we measured the cost of staying single-vector against a no-torch ONNX NLI cross-encoder (Xenova/nli-deberta-v3-small, which reads both utterances jointly with cross-attention): the cheap linear probe ties it on explicit approve/reject (6/6 each), but the NLI model wins 6/6 vs 0/6 on hard stance (negation, implicit, and off-axis opinions the probe cannot reach). So the probe is the right resolver-independent in-protocol audit, and the NLI cross-encoder is the optional high-accuracy analyzer; both are no-torch (examples/resonance_bft_embeddings/nli_vs_probe.py).
Honest scope: this is a single-process, in-memory-transport drive (not multi-host networking), but it is the plugin genuinely reaching, and committing, consensus inside a town run, over the message bus, across many rounds.
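The fault-tolerance outcomes reported in this section (a commit with two silent agents, no commit under a 4/3 partition) both follow from comparing the number of responders against n−f with n held at the full membership. The sketch below restates that arithmetic only; can_commit and its parameter names are hypothetical, and it does not model sealing, tamper exclusion, or the coherence threshold.

```python
def quorum_needed(n: int, f: int) -> int:
    """Commit certificate size n - f, defined only when n >= 3f + 1."""
    if n < 3 * f + 1:
        raise ValueError(f"need n >= 3f+1, got n={n}, f={f}")
    return n - f


def can_commit(responders: int, expected_n: int, f: int) -> bool:
    """True when enough evaluations arrived to meet the n - f quorum.

    n is the fixed membership, never the count of evaluations received,
    so a partitioned minority cannot shrink its own quorum bar.
    """
    n = max(responders, expected_n)
    return responders >= quorum_needed(n, f)


# 7 agents, f = 2, two silent followers: 5 responders meet quorum 5.
silent_case = can_commit(responders=5, expected_n=7, f=2)
# 4/3 partition of the same cluster: 4 reachable agents fall short of 5.
partition_case = can_commit(responders=4, expected_n=7, f=2)
```

The same function gives a quorum of 9 for the 12-agent scenario with f = 3, matching the "9 of 12" figure above.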
Real-LLM town evidence: actual agents, real models, across four subscription backends
The scenarios above use scripted stances so they are deterministic and CI-runnable. To confirm the town commits with real model-generated opinions, examples/llm_consensus/evidence_town.py drives the same committed scenario YAMLs through the same ScenarioRunner, but each honest agent's opinion is produced by a real LLM via a key-free subscription CLI (Claude Code claude, Codex codex, Antigravity agy), injected through a generic, default-off scenario hook (task.config["opinions"]; with no opinions supplied the scenario is byte-for-byte the shipped scripted behaviour, so this adds no runtime dependency and cannot affect plugin behaviour). It is live-marked and excluded from CI (-m "not live"), so it never gates the submission.
A full matrix (5 scenarios × 4 tiers × 3 reps = 60 real town runs, lowest/fastest models only, since the deterministic core needs no frontier model) is recorded in examples/llm_consensus/EVIDENCE.md. The four tiers, with exact model ids (full config + metrics in examples/llm_consensus/MODELS.md): mock (deterministic), claude:haiku = Anthropic claude-haiku-4-5, codex = OpenAI gpt-5.5 (model_reasoning_effort=low), and agy = Antigravity fronting Google Gemini 3.5 Flash (Low thinking tier); agy is a CLI, not a model. Result: every scenario with quorum slack reached its designed outcome on every model and every rep (byzantine commits with tampered=2, partition never commits, bag-of-words commits), so the decision is model-agnostic: the LLM changes the opinions and the audit's consensus_type, never the commit rule. Real-LLM testing also surfaced an honest liveness nuance the scripted mock masks: at the exact quorum floor (silent-crash, present == n−f, zero slack) a single genuinely-divergent real opinion can prevent assembling a coherent quorum, so the round safely does not commit. That is correct safety (never commit an incoherent quorum), and a sensitivity that scenarios with even one seat of slack absorb every time. This is the "transformer-at-the-edge, deterministic core" thesis demonstrated on real models, not just asserted.
A second, larger harness (examples/llm_consensus/evidence_scale.py, report in EVIDENCE_LARGE.md, with a one-glance dashboard in EVIDENCE_LARGE.svg) enlarges four dimensions independently, for 88 real-town runs (80 + 8 cross-model at n=13,25), and sharpens each claim:
- Commits at n−f exactly at every size: e.g. Gemini 3.5 Flash commits 33/33 at n = 49, and claude-haiku-4-5 and gpt-5.5 both commit at n = 13 (9/9) and n = 25 (17/17) too, so scaling is confirmed across three independent models, not one. The deterministic core scales; only the opinions come from the model.
- consensus_type shifts with how opinions cluster.
- At the present == n−f floor (zero slack), across 10 reps × 3 models: agy 10/10, claude 9/10, codex 9/10 commit. The ~1-in-10 non-commit is the protocol safely refusing an incoherent quorum, the same nuance the small run surfaced, now with a rate.
- consensus_type warms up fragile → genuine as the Layer-3 trust/weights adapt on real opinions, while the L1 n−f certificate stays untouched.
The harness persists every run and supports --resume, so this matrix completed across several interruptions with zero repeated work; it is live-marked and never runs in CI.
Verification
The full edge-case matrix and the machine-checked invariant list are in Appendix E and Appendix F; the package layout is in Appendix A.
Run all tests:
Verify plugin is discoverable:
Run ResonanceBFT — full pipeline including deliberation:
Test that deliberation does NOT break the tamper check (the key commit correctness fix):
Test Byzantine agent is detected when it changes its semantic belief:
Run adversarial validators:
Use existing consensus scenario:
Hardening in this revision. Adversarial LLM-judge review (gpt-5.5-pro at high reasoning) surfaced three latent defects, each now fixed with a regression test:
- resolve() derived n from received evaluations, letting a partitioned minority lower its own quorum bar. n is now the fixed cluster membership (max(present, expected_n), sourced from all_agents, task.metadata["expected_participants"], or the expected_n constructor arg).
- … _cosine zero-pads, so their commitment stays valid (only the current agent re-embeds its own text and re-signs).
- clamp_threshold() was defined but never called; it now runs after every Layer-2 threshold update, so the adaptive threshold can never erode below the BFT safety lower bound.
Deliberation semantics were also clarified: deliberate() drives trajectory classification and adaptive learning, while the commit certificate is intentionally fixed by the sealed belief axes (anti-sycophancy).
Safety proof sketch (n=7, f=2)
- quorum_needed = n−f = 5
- Any two quorums of 5 overlap in at least 5+5−7 = 3 agents
- At most f=2 of those 3 can be Byzantine → ≥ 1 honest agent in both quorums → no conflicting commits
- validate_bft_no_conflicting_commits enforces this mechanically.
- validate_bft_no_equivocation enforces that no agent presents different sealed beliefs to different resolvers.
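The counting argument above generalises beyond n=7. The sketch below is plain arithmetic under the usual assumption that at most f members are Byzantine; the helper names are illustrative and are not part of the plugin. It checks that an n−f quorum always leaves at least one honest agent in any two-quorum overlap when n ≥ 3f+1, whereas a fixed 2f+1 quorum stops guaranteeing that once n exceeds 3f+1.

```python
def min_overlap(n: int, quorum: int) -> int:
    """Smallest possible intersection of two quorums drawn from n agents."""
    return max(0, 2 * quorum - n)


def guaranteed_honest(n: int, f: int, quorum: int) -> int:
    """Honest agents guaranteed in the overlap if all f Byzantine agents sit in it."""
    return max(0, min_overlap(n, quorum) - f)


# Worked case from the proof sketch: n = 7, f = 2, quorum = n - f = 5.
worked = (min_overlap(7, 5), guaranteed_honest(7, 2, 5))

# n - f keeps at least one honest agent in the overlap for every n >= 3f + 1.
f = 2
n_minus_f_safe = all(
    guaranteed_honest(n, f, n - f) >= 1 for n in range(3 * f + 1, 3 * f + 21)
)

# A fixed 2f + 1 quorum at n = 10, f = 2: two quorums of 5 can be disjoint.
fixed_rule_at_10 = guaranteed_honest(10, f, 2 * f + 1)
```

With the n−f rule the overlap is n−2f, which is at least f+1 exactly when n ≥ 3f+1; that is the inequality the loop exercises.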