feat(p2p): add highlighted peer observability#2854

Open
julienh-ssv wants to merge 4 commits into stage from attack-sim-observability-clean

@julienh-ssv
Contributor

Summary

Adds targeted observability for attack-simulator traffic by letting an SSV node highlight one or more configured libp2p peers, including peers configured by their secp256k1 public key. The stage attack-simulator public key can now be passed through P2P_HIGHLIGHTED_PEERS, with an optional P2P_HIGHLIGHTED_PEER_LABEL such as attack-simulator.

The implementation wires a shared highlighted peer observer through P2P setup and message validation so we can follow traffic from the attack simulator across connection, stream, pubsub, peer score, wire validation, and SSV-level validation paths.

What changed

  • Added network/peers/peertrace to parse highlighted peer IDs/public keys and emit consistent logs and metrics.
  • Added config/env support for P2P_HIGHLIGHTED_PEERS and P2P_HIGHLIGHTED_PEER_LABEL.
  • Added highlighted peer logs for connection handling, stream handling, pubsub validation, delivered pubsub messages, pubsub trace events, and peer score inspection.
  • Added SSV-level validation observer events so highlighted messages report accepted/ignored/rejected decisions with reason, role, SSV message type, slot, duty executor ID, signers, and QBFT fields when available.
  • Added counters under ssv.p2p.highlighted_peer for events, pubsub validation outcomes, and SSV validation outcomes.
  • Expanded pubsub trace metadata for message counts/topics/IDs and IHAVE/IWANT/GRAFT/PRUNE control metadata, which helps identify barrage and mesh/control-plane attacks.
  • Documented the attack-simulator public key in config/config.example.yaml.
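
The config-to-observer wiring in the list above could look roughly like the following standard-library sketch. The comma-separated format for P2P_HIGHLIGHTED_PEERS, the default label, and the names `Observer`, `NewObserver`, and `IsHighlighted` are assumptions for illustration. The real peertrace package resolves peer IDs and secp256k1 public keys to libp2p peer IDs, which this sketch treats as opaque strings.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// Observer is a hypothetical stand-in for the highlighted peer observer.
// Peer identifiers are opaque strings here, not real libp2p peer IDs.
type Observer struct {
	label string
	peers map[string]struct{}
}

// NewObserver builds an observer from a raw peer list (assumed to be
// comma-separated) and an optional label. It returns nil when no peers
// are configured, so "disabled" and "nil" mean the same thing to callers.
func NewObserver(rawPeers, label string) *Observer {
	peers := make(map[string]struct{})
	for _, p := range strings.Split(rawPeers, ",") {
		if p = strings.TrimSpace(p); p != "" {
			peers[p] = struct{}{}
		}
	}
	if len(peers) == 0 {
		return nil
	}
	if label == "" {
		label = "highlighted" // assumed default, not taken from the PR
	}
	return &Observer{label: label, peers: peers}
}

// IsHighlighted is safe to call on a nil *Observer.
func (o *Observer) IsHighlighted(peer string) bool {
	if o == nil {
		return false
	}
	_, ok := o.peers[peer]
	return ok
}

func main() {
	obs := NewObserver(
		os.Getenv("P2P_HIGHLIGHTED_PEERS"),
		os.Getenv("P2P_HIGHLIGHTED_PEER_LABEL"),
	)
	fmt.Println("observer enabled:", obs != nil)
}
```

Returning nil for an empty configuration lets every call site invoke the observer unconditionally, which keeps the instrumentation out of the way when no peer is highlighted.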

Why

When running attack-simulator against a stage SSV node, it was difficult to tell whether attack traffic reached the node, whether it passed wire-level pubsub validation, and how far it progressed through SSV message validation. Highlighting the simulator peer gives us a low-noise view of simulator-originated traffic and enough context to understand whether each attack is affecting gossip, scoring, stream handling, or SSV validation.

Validation

  • gopls diagnostics on edited Go files: clean
  • git diff --check: clean
  • ok github.com/ssvlabs/ssv/message/validation (cached)
    ok github.com/ssvlabs/ssv/network/peers/peertrace (cached)
    ok github.com/ssvlabs/ssv/network/topics (cached)
  • ok github.com/ssvlabs/ssv/cli/operator (cached) [no tests to run]
    ok github.com/ssvlabs/ssv/network/p2p (cached) [no tests to run]
  • go test ./network/streams ./network/peers/connections ./network/peers/peertrace ./message/validation -timeout 60s

Note: go_vulncheck ./... could not run locally because the scanner binary is built with Go 1.25 while this checkout requires Go 1.26. No dependency changes are included in this PR.

@julienh-ssv julienh-ssv requested review from a team as code owners May 18, 2026 06:31
@julienh-ssv julienh-ssv self-assigned this May 18, 2026
@julienh-ssv julienh-ssv requested a review from alok-ssv May 18, 2026 06:32
@codecov

codecov Bot commented May 18, 2026

Codecov Report

❌ Patch coverage is 73.00771% with 105 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.9%. Comparing base (0fa3b4a) to head (c7bfd33).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| network/peers/peertrace/observer.go | 77.7% | 18 Missing and 14 partials ⚠️ |
| message/validation/errors.go | 14.8% | 23 Missing ⚠️ |
| network/topics/tracer.go | 69.2% | 20 Missing ⚠️ |
| cli/operator/node.go | 0.0% | 9 Missing ⚠️ |
| network/p2p/p2p.go | 64.2% | 3 Missing and 2 partials ⚠️ |
| network/topics/controller.go | 80.7% | 5 Missing ⚠️ |
| network/peers/connections/conn_gater.go | 91.8% | 3 Missing and 1 partial ⚠️ |
| message/validation/options.go | 0.0% | 3 Missing ⚠️ |
| network/topics/pubsub.go | 25.0% | 2 Missing and 1 partial ⚠️ |
| message/validation/validation.go | 0.0% | 1 Missing ⚠️ |


@greptile-apps
Contributor

greptile-apps Bot commented May 18, 2026

Greptile Summary

This PR adds a targeted observability layer for "highlighted" libp2p peers — identified by peer ID or secp256k1 public key — so attack-simulator traffic can be followed across connection, stream, pubsub, scoring, wire-validation, and SSV-validation layers without noise from the rest of the mesh.

  • New peertrace package parses peer IDs/public keys from config, emits consistent zap log lines and three OTel counters (events, validations, ssv_validations), and exposes nil-safe methods used throughout the stack.
  • SSV validation observer hooks every outcome path in handleValidationError and the new handleValidationSuccess(ctx, peerID, …) signature, providing full accept/ignore/reject coverage with role, slot, signers, and QBFT fields.
  • Pubsub tracer and scoring now expose richer control-plane metadata (IHAVE/IWANT/GRAFT/PRUNE counts, message IDs/topics in RPCs) and activate the event tracer whenever a peer observer is configured, even if TraceLog is off.
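
As a rough illustration of the accept/ignore/reject accounting summarized above, the sketch below counts SSV validation outcomes for highlighted peers only. `SSVValidationEvent` and `ObserveSSVValidation` are names that appear in this PR, but the field set, the `Recorder` type, and the in-memory counters (standing in for the OTel counters) are assumptions made for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// Outcome is the SSV-level validation decision.
type Outcome string

const (
	Accepted Outcome = "accepted"
	Ignored  Outcome = "ignored"
	Rejected Outcome = "rejected"
)

// SSVValidationEvent carries a subset of the context listed in the PR.
// The field names are assumptions for this sketch.
type SSVValidationEvent struct {
	Peer    string
	Outcome Outcome
	Reason  string
	Slot    uint64
}

// Recorder is a hypothetical in-memory stand-in for the real counters.
type Recorder struct {
	mu          sync.Mutex
	highlighted map[string]struct{}
	counts      map[Outcome]int
}

// NewRecorder creates a recorder that tracks the given peers.
func NewRecorder(peers ...string) *Recorder {
	r := &Recorder{
		highlighted: make(map[string]struct{}),
		counts:      make(map[Outcome]int),
	}
	for _, p := range peers {
		r.highlighted[p] = struct{}{}
	}
	return r
}

// ObserveSSVValidation counts the event if its peer is highlighted and
// reports whether it was recorded. It is safe on a nil *Recorder.
func (r *Recorder) ObserveSSVValidation(ev SSVValidationEvent) bool {
	if r == nil {
		return false
	}
	if _, ok := r.highlighted[ev.Peer]; !ok {
		return false
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	r.counts[ev.Outcome]++
	return true
}

// Count returns how many events with the given outcome were recorded.
func (r *Recorder) Count(o Outcome) int {
	if r == nil {
		return 0
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.counts[o]
}

func main() {
	rec := NewRecorder("sim")
	rec.ObserveSSVValidation(SSVValidationEvent{Peer: "sim", Outcome: Rejected, Reason: "example reason", Slot: 1})
	rec.ObserveSSVValidation(SSVValidationEvent{Peer: "other", Outcome: Accepted})
	fmt.Println(rec.Count(Rejected), rec.Count(Accepted))
}
```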

Confidence Score: 3/5

Safe to merge for functionality, but scoring.go now builds per-peer log fields on every inspection cycle instead of only on log cycles, which could noticeably increase allocations on production nodes with many peers.

The observability wiring is correct end-to-end and the nil-safe Observer design prevents crashes. The one behavioural change worth addressing before merging is in scoring.go: the fields slice and the peerObserver.Observe call are now executed for every peer on every inspection cycle, not just on log-frequency cycles. On a busy node this is a meaningful increase in allocations that grows linearly with peer count and inversely with logFrequency.

network/topics/scoring.go — the log-field construction was moved before the logFrequency gate, causing per-cycle work for all peers.
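
One possible way to address this concern, shown as a simplified sketch rather than the PR's actual code, is to build log fields only on log cycles or for highlighted peers. The `inspector` and `field` types and the modulo-based log cycle are assumptions; the real score inspector and its `logFrequency` handling may differ.

```go
package main

import "fmt"

// field is a stand-in for a structured log field (for example a zap.Field).
type field struct {
	key   string
	value any
}

// inspector is a hypothetical, simplified score inspector.
type inspector struct {
	logFrequency int                 // log every N-th inspection cycle
	cycle        int                 // number of cycles seen so far
	highlighted  map[string]struct{} // peers that are always observed
}

// inspect returns how many peers had log fields built during this cycle.
func (in *inspector) inspect(scores map[string]float64) int {
	in.cycle++
	logCycle := in.logFrequency > 0 && in.cycle%in.logFrequency == 0
	built := 0
	for peer, score := range scores {
		_, isHighlighted := in.highlighted[peer]
		if !logCycle && !isHighlighted {
			continue // cheap path: no field allocation for ordinary peers
		}
		fields := []field{
			{key: "peer", value: peer},
			{key: "score", value: score},
		}
		built++
		_ = fields // a real inspector would pass these to the logger or observer
	}
	return built
}

func main() {
	in := &inspector{
		logFrequency: 3,
		highlighted:  map[string]struct{}{"sim": {}},
	}
	scores := map[string]float64{"sim": -5, "a": 1, "b": 2}
	for i := 0; i < 3; i++ {
		fmt.Println(in.inspect(scores)) // prints 1, 1, 3 on successive cycles
	}
}
```

With this gate, the per-cycle cost outside log cycles scales with the number of highlighted peers instead of the number of connected peers.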

Important Files Changed

| Filename | Overview |
|---|---|
| network/peers/peertrace/observer.go | New package implementing the highlighted-peer observer; core logic is sound, but ObserveSSVValidation uses the global zap.L() instead of a passed logger, unlike the other two observe methods. |
| network/topics/scoring.go | Moving fields construction before the logFrequency gate introduces per-peer per-cycle allocations for every connected peer, not just highlighted ones; the peerObserver.Observe call itself is O(1) but the field slice it receives is built unconditionally. |
| network/peers/connections/conn_handler.go | Adds connection/disconnection/filter highlight observations; variadic peerObserver parameter silently discards extra arguments but is otherwise correct. |
| network/topics/tracer.go | Wires peerObserver into the pubsub event tracer and conditionally gates debug logging on traceLog; the tracer is now activated whenever a peer observer is enabled, which is the intended behaviour. |
| network/topics/controller.go | Wraps the topic validator to record pubsub-received counters and call ObserveValidation for highlighted peers; closure captures the topic name parameter (not a loop variable) so there is no variable-capture issue. |
| message/validation/errors.go | All validation outcomes (timeout, cancel, non-Error, ignored Error, rejected Error, success) now emit an SSV validation observation; coverage looks complete. |
| network/p2p/p2p.go | Observer is lazily built if not injected via Config.PeerObserver, and callers (node.go) pre-build and inject it; double-construction is guarded correctly. |
| network/streams/controller.go | Stream request/response and oversized-payload events now emit highlight observations; nil observer is safe due to nil-receiver methods on Observer. |
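
The variadic-parameter note for conn_handler.go can be illustrated with a small sketch. The constructor names and the `ConnHandler` and `Observer` shapes below are hypothetical and do not reproduce the PR's signatures; the point is only to contrast a variadic observer argument with an explicit, nil-tolerant one.

```go
package main

import "fmt"

// Observer and ConnHandler are hypothetical, reduced types for this sketch.
type Observer struct{ label string }

type ConnHandler struct{ observer *Observer }

// NewConnHandlerVariadic mirrors the flagged pattern: only the first
// observer is used and any additional arguments are dropped silently.
func NewConnHandlerVariadic(observers ...*Observer) *ConnHandler {
	h := &ConnHandler{}
	if len(observers) > 0 {
		h.observer = observers[0]
	}
	return h
}

// NewConnHandler takes exactly one observer; nil means "disabled".
// Passing two observers is now a compile-time error instead of a no-op.
func NewConnHandler(observer *Observer) *ConnHandler {
	return &ConnHandler{observer: observer}
}

func main() {
	a, b := &Observer{label: "a"}, &Observer{label: "b"}
	h := NewConnHandlerVariadic(a, b) // b is ignored without any warning
	fmt.Println(h.observer.label)     // prints "a"
	fmt.Println(NewConnHandler(b).observer.label)
}
```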

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    CFG["Config\nHighlightedPeers / HighlightedPeerLabel"] --> OBS["peertrace.Observer\n(New / parsePeer)"]
    OBS --> CONN["connections.ConnHandler\nconnected / disconnected / filtered"]
    OBS --> STREAM["streams.StreamController\nrequest / response / oversized"]
    OBS --> TRACER["topics.psTracer\npubsub trace events"]
    OBS --> SCORING["topics.scoreInspector\npubsub_peer_score"]
    OBS --> CTRL["topics.topicsCtrl\npubsub_message_delivered\nObserveValidation"]
    OBS --> VALIF["validation.messageValidator\nObserveSSVValidation"]
    CTRL -->|"wrappedValidator\nrecordPubsubMessageReceived"| CTRL2["pubsub topic validator\naccept / ignore / reject"]
    VALIF -->|"outcome + reason + QBFT fields"| EVT["SSVValidationEvent\n(accepted / ignored / rejected)"]
    OBS -->|"Observe / ObserveValidation\nObserveSSVValidation"| METRICS["OTel Counters\nhighlighted_peer.events\nhighlighted_peer.validations\nhighlighted_peer.ssv_validations"]

Reviews (1): Last reviewed commit: "feat(p2p): add highlighted peer observab..."

Comment thread network/topics/scoring.go
Comment thread network/peers/peertrace/observer.go
Comment thread network/peers/connections/conn_handler.go Outdated