Add LangSec bencode validator with bounded grammar #3
Merged
Introduces internal/bencode/Validate, a structural walker that bounds nesting depth, string length, list length, dict key count, and total input size before any permissive decoder allocates memory or recurses on the payload. Inspired by Sassaman & Patterson, "The Science of Insecurity": recognize completely against a bounded grammar, then execute.

- internal/bencode/validate.go: pure structural validator that walks bencode bytes without allocating the decoded tree. Configurable Limits with three presets sized for their threat model: TorrentLimits (16 MiB, depth 64), PeerMessageLimits (1 MiB, depth 16), TrackerResponseLimits (2 MiB).
- internal/bencode/validate_test.go: ~50 subtests covering happy paths, malformed bencode (empty, trailing data, leading-zero ints, declared-only oversize strings), and each constraint violation.
- internal/torrent/parse.go: Validate against TorrentLimits before bencode.DecodeBytes. walkFileTree gains an explicit depth guard (maxFileTreeDepth = 64) as defense-in-depth.
- internal/client/peer.go (BEP 10 extension handshake) and internal/client/metadata.go (BEP 9 metadata header): Validate against PeerMessageLimits before decoding, because these come from attacker-controllable peer connections.
- internal/client/tracker.go: Validate against TrackerResponseLimits before decoding the tracker reply.
- internal/torrent/parse_test.go: new tests for malformed bencode and deeply-nested input rejection.

Closes the unbounded-recursion vector in walkFileTree and kills the oversized-string-allocation vector in every bencode decode site.
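To make the idea of a structural walker concrete, here is a minimal sketch of what such a validator could look like. It is not the code from this PR: the Limits field names (other than MaxStrLen, which the description mentions), the helper names walk, walkInt and walkString, and the single error value are all assumptions made for illustration. The sketch tracks byte offsets only and never builds a decoded tree; it does not check dict key ordering or duplicate keys.

```go
package main

import "fmt"

// Limits bounds one bencode value. Field names are assumptions for this sketch.
type Limits struct{ MaxSize, MaxDepth, MaxStrLen, MaxItems int }

var errInvalid = fmt.Errorf("bencode: malformed or over-limit input")

// Validate accepts b only if it is exactly one well-formed value within lim.
func Validate(b []byte, lim Limits) error {
	if len(b) == 0 || len(b) > lim.MaxSize {
		return errInvalid
	}
	if end, err := walk(b, 0, 1, lim); err != nil || end != len(b) {
		return errInvalid // malformed, over a limit, or trailing data
	}
	return nil
}

// walk checks the value starting at b[i] and returns the offset just past it.
func walk(b []byte, i, depth int, lim Limits) (int, error) {
	if i >= len(b) {
		return 0, errInvalid
	}
	switch c := b[i]; {
	case c == 'i':
		return walkInt(b, i+1)
	case c >= '0' && c <= '9':
		return walkString(b, i, lim)
	case c == 'l' || c == 'd':
		if depth > lim.MaxDepth {
			return 0, errInvalid // also bounds this function's own recursion
		}
		items := 0
		for i++; i < len(b) && b[i] != 'e'; items++ {
			if items == lim.MaxItems {
				return 0, errInvalid
			}
			var err error
			if c == 'd' { // a dict entry is a string key followed by any value
				if i, err = walkString(b, i, lim); err != nil {
					return 0, err
				}
			}
			if i, err = walk(b, i, depth+1, lim); err != nil {
				return 0, err
			}
		}
		if i >= len(b) {
			return 0, errInvalid // container never closed
		}
		return i + 1, nil
	}
	return 0, errInvalid
}

// walkInt checks the digits after 'i', rejecting empty, leading-zero and "-0" forms.
func walkInt(b []byte, i int) (int, error) {
	start := i
	if i < len(b) && b[i] == '-' {
		i++
	}
	digits := i
	for i < len(b) && b[i] >= '0' && b[i] <= '9' {
		i++
	}
	if i == digits || i >= len(b) || b[i] != 'e' ||
		(b[digits] == '0' && (i-digits > 1 || digits > start)) {
		return 0, errInvalid
	}
	return i + 1, nil
}

// walkString checks "<len>:<bytes>", rejecting the declared length before
// looking at the payload so an oversized claim never reaches an allocator.
func walkString(b []byte, i int, lim Limits) (int, error) {
	start, n := i, 0
	for ; i < len(b) && b[i] >= '0' && b[i] <= '9'; i++ {
		if n = n*10 + int(b[i]-'0'); n > lim.MaxStrLen {
			return 0, errInvalid // stopping here also keeps n from overflowing
		}
	}
	if i == start || i >= len(b) || b[i] != ':' || (b[start] == '0' && i-start > 1) {
		return 0, errInvalid
	}
	if i++; n > len(b)-i {
		return 0, errInvalid // declared length exceeds the remaining input
	}
	return i + n, nil
}

func main() {
	lim := Limits{MaxSize: 1 << 20, MaxDepth: 16, MaxStrLen: 1 << 16, MaxItems: 1024}
	fmt.Println(Validate([]byte("d3:fooi42ee"), lim), Validate([]byte("i042e"), lim))
}
```

The limit values in main are illustrative and only loosely modeled on the PeerMessageLimits figures quoted above. Checking the declared string length as the digits are read is what lets a "declared-only oversize string" be rejected without ever touching the payload.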
Summary
Introduces internal/bencode/Validate, a structural walker that bounds nesting depth, string length, list length, dict key count, and total input size before any permissive decoder allocates memory or recurses on the payload. Applied to every untrusted bencode decode site in the codebase.

This is Priority 2 of 3 from the LangSec analysis. P1 (announce recognizer, #2) and P3 (peer wire-protocol state machine) are independent.
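The "validate first, then decode" ordering at each decode site can be sketched as a small gate function. Everything here is hypothetical: gate, sizeOnly, the one-field Limits stand-in, and the placeholder decoder are not names from this PR, and the real Validate checks structure as well as size. The point of the sketch is only the control flow, in which the decoder never sees bytes that failed validation.

```go
package main

import (
	"errors"
	"fmt"
)

// Limits is a stand-in for the PR's Limits type; only a size bound is modeled.
type Limits struct{ MaxSize int }

// gate runs validate to completion and calls decode only if validation passed.
func gate(
	raw []byte,
	lim Limits,
	validate func([]byte, Limits) error,
	decode func([]byte) (interface{}, error),
) (interface{}, error) {
	if err := validate(raw, lim); err != nil {
		return nil, fmt.Errorf("rejected before decode: %w", err)
	}
	return decode(raw)
}

// sizeOnly is a placeholder validator (assumption); it checks total size only.
func sizeOnly(raw []byte, lim Limits) error {
	if len(raw) > lim.MaxSize {
		return errors.New("input exceeds MaxSize")
	}
	return nil
}

func main() {
	decodeCalls := 0
	// Placeholder for a permissive decoder; it just counts invocations.
	decode := func(raw []byte) (interface{}, error) {
		decodeCalls++
		return string(raw), nil
	}
	_, err := gate([]byte("d3:fooi42ee"), Limits{MaxSize: 4}, sizeOnly, decode)
	fmt.Println(err, decodeCalls)
}
```

Passing the preset as an argument mirrors the description above, where each call site picks the limits that match its threat model.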
What's new
- internal/bencode/validate.go: pure structural validator that walks bencode bytes without allocating the decoded tree. Three preset Limits:
  - TorrentLimits (16 MiB, depth 64) for .torrent files
  - PeerMessageLimits (1 MiB, depth 16) for BEP 9/10 peer messages
  - TrackerResponseLimits (2 MiB) for tracker replies
- internal/bencode/validate_test.go: ~50 subtests covering happy paths, malformed bencode (empty, trailing data, leading-zero ints, declared-only oversize strings), and each constraint violation.

Where it's wired
- internal/torrent/parse.go (.torrent decode): TorrentLimits (walkFileTree recursion + arbitrary-length string allocation)
- internal/client/peer.go (BEP 10 ext handshake): PeerMessageLimits
- internal/client/metadata.go (BEP 9 metadata header): PeerMessageLimits
- internal/client/tracker.go (tracker response): TrackerResponseLimits

walkFileTree also gains an explicit maxFileTreeDepth = 64 guard as defense-in-depth: even if a future caller skips Validate, the recursion can't blow the stack.

Test plan
- go test ./...: all packages green (bencode, torrent, client, tracker, wl)
- go vet ./...: clean
- internal/torrent/parse_test.go: empty input, trailing garbage, unterminated dict, non-bencode, deeply nested all rejected
- Limits are appropriately sized for real-world workloads (esp. TorrentLimits.MaxStrLen = 8 MiB for very long pieces strings)
- wbencode import alias convention works for the project (avoids package-name collision with github.com/zeebo/bencode)

🤖 Generated with Claude Code
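For reviewers who want to see the shape of the maxFileTreeDepth guard mentioned under "Where it's wired", a rough sketch follows. Only the constant name and its value of 64 come from this PR; the walkFileTree signature, the nested-map representation of the file tree, the errTooDeep error, and the nest helper are assumptions made so the example is self-contained.

```go
package main

import (
	"errors"
	"fmt"
)

const maxFileTreeDepth = 64 // value taken from the PR description

var errTooDeep = errors.New("file tree exceeds maxFileTreeDepth")

// walkFileTree is a hypothetical stand-in: nested maps are directories and
// any other value is a file. The depth check runs before any recursion.
func walkFileTree(node map[string]interface{}, prefix string, depth int, visit func(path string)) error {
	if depth > maxFileTreeDepth {
		return errTooDeep
	}
	for name, child := range node {
		path := prefix + "/" + name
		if dir, ok := child.(map[string]interface{}); ok {
			if err := walkFileTree(dir, path, depth+1, visit); err != nil {
				return err
			}
			continue
		}
		visit(path)
	}
	return nil
}

// nest builds n nested directories around a single file, for exercising the guard.
func nest(n int) map[string]interface{} {
	node := map[string]interface{}{"f": 1}
	for i := 0; i < n; i++ {
		node = map[string]interface{}{"d": node}
	}
	return node
}

func main() {
	show := func(p string) { fmt.Println("file:", p) }
	fmt.Println(walkFileTree(nest(3), "", 1, show))
	fmt.Println(walkFileTree(nest(200), "", 1, show))
}
```

Because the guard lives inside the recursive function, it holds regardless of whether the caller validated the input first, which is the defense-in-depth property the description asks for.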