Filter expression parser — closes #11#12
Merged
Replaces the flat substring-split filter engine in
PacketAnalysisWindowController with a proper tokenizer + recursive-descent
parser + AST evaluator. Fixes the three bugs surfaced in issue #11's review
of PR #10.

The parser
==========

`PacketFilter.compile(_:)` produces an evaluable expression. The AST is
purely-functional + thread-safe; `compiledFilter` is cached on the analysis
controller so per-packet evaluation doesn't re-parse.

Grammar
-------

    expr      ::= orExpr
    orExpr    ::= andExpr ( "or" andExpr )*
    andExpr   ::= notExpr ( "and" notExpr )*
    notExpr   ::= "not" notExpr | atom
    atom      ::= "(" expr ")" | predicate
    predicate ::= protocol | "port" op N | "length" op N
                | "ip.addr"/"ip.src"/"ip.dst" op IP
                | "info" "contains" "string"
                | identifier     # legacy bare-protocol or substring
                | identifier N   # sugar: "tcp 80" -> tcp AND port == 80
    op        ::= == | != | < | > | <= | >=

Operator keywords (and/or/not/port/length/info/contains/ip.*) are reserved;
quote them with "..." to match those words literally.

Three review-bug regressions pinned in PacketFilterTests
========================================================

#1 Suspicious TLDs filter respects parens
   `dns and (info contains ".tk" or info contains ".ml" ...)` no longer
   falls into the " or " substring-split trap. HTTP traffic carrying "html"
   does NOT match the DNS-suspicious-TLDs filter anymore.

#2 Non-Standard Ports filter is numeric
   `port != 80` is a real comparison. Port 8080 / 5300 / 4430 pass through
   the "not 80 / not 443 / not 53 / not 22" filter as the user actually
   intends.

#3 Misleading presets honestly relabeled
   Eight presets promised semantics the engine can't reach without
   per-packet payload inspection (User-Agent matching, SYN-flag inspection,
   TLS record dissection). They've been split into two buckets:
   - Implemented via length predicates:
     "DNS Tunneling (Long Queries)" -> dns AND length > 100;
     "Large Outbound Transfers" -> tcp AND length > 1000;
     "ICMP with Payload" -> icmp AND length > 64.
   - Honestly renamed: "Non-Browser HTTP (curl/wget/python)" ->
     "HTTP (inspect for non-browser UA manually)". Same for the four TLS
     variants (collapsed to two entries), short-TCP beacons, base64-in-HTTP,
     and SYN-flood scanning. The showFilterInfo callout now prompts the
     analyst on what to inspect manually.

Tests
=====

PacketFilterTests: 17 new tests covering the three regression cases above
plus boolean precedence, atoms (proto / port / length / IP / substring),
`tcp N` sugar, whitespace + case-insensitivity, and malformed-input error
surfacing.

Full suite: 267 / 267 passing (up from 249), one pre-existing
network-dependent test still skipped as before.

CapturedPacket now conforms to `PacketLike` so the evaluator can run against
the real packet stream without a runtime adapter; the protocol exists so
unit tests can supply lightweight fixtures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Replaces the flat substring-split filter engine in `PacketAnalysisWindowController` with a real tokenizer + recursive-descent parser + AST evaluator. Closes the three pre-existing bugs filed in #11.
The three review bugs, before/after
#1 Suspicious TLDs filter is wildly over-broad — `dns and (tk or ml or ga or cf or gq)`
#2 Non-Standard Ports excludes legitimate non-standard ports — `tcp and not 80 and not 443 and not 22 and not 53`
#3 Eight presets promise semantics the engine couldn't deliver
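To make bugs #1 and #2 concrete, here is a minimal sketch of why substring matching misfires. It assumes the legacy engine treated a bare term such as `80` or `ml` as a substring test, which is how the commit message describes it; `passesLegacy` and `passesNumeric` are hypothetical helpers for illustration, not functions from this PR.

```swift
import Foundation

// Ports the "Non-Standard Ports" preset is meant to exclude.
let standardPorts = [80, 443, 53, 22]

// Assumed legacy behaviour: "not 80" rejects any packet whose port,
// rendered as text, contains the substring "80".
func passesLegacy(port: Int) -> Bool {
    let text = String(port)
    return !standardPorts.contains { text.contains(String($0)) }
}

// Numeric behaviour: "port != 80" rejects only port 80 itself.
func passesNumeric(port: Int) -> Bool {
    return !standardPorts.contains(port)
}

for port in [8080, 5300, 4430, 443] {
    print(port, passesLegacy(port: port), passesNumeric(port: port))
}
// prints:
// 8080 false true
// 5300 false true
// 4430 false true
// 443 false false

// The same trap drives bug #1: a bare "ml" term matches unrelated HTTP text.
print("GET /index.html".contains("ml"))  // true
```

Under the substring assumption, ports 8080, 5300, and 4430 are wrongly dropped because their digits contain an excluded port; the numeric comparison keeps them and drops only the exact matches.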
Grammar
```
expr ::= orExpr
orExpr ::= andExpr ( "or" andExpr )*
andExpr ::= notExpr ( "and" notExpr )*
notExpr ::= "not" notExpr | atom
atom ::= "(" expr ")" | predicate
predicate ::= protocol | "port" op N | "length" op N
| "ip.addr"/"ip.src"/"ip.dst" op IP
| "info" "contains" "string"
| identifier # legacy bare-protocol or substring
| identifier N # sugar: "tcp 80" → tcp AND port == 80
op ::= == | != | < | > | <= | >=
```
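The boolean layer of this grammar maps directly onto one parsing function per rule. The following is a minimal sketch of that shape, not the PR's actual `PacketFilter` implementation: `Packet`, `Expr`, `Parser`, and `ParseError` are hypothetical names, only the `protocol`, `port op N`, and `length op N` predicates are covered, and the simplified tokenizer assumes operators are separated by whitespace.

```swift
import Foundation

// Hypothetical fixture; the PR evaluates against `PacketLike` instead.
struct Packet { let proto: String; let port: Int; let length: Int }

// Immutable AST: evaluation has no side effects.
indirect enum Expr {
    case proto(String)
    case compare(field: String, op: String, value: Int)
    case not(Expr), and(Expr, Expr), or(Expr, Expr)

    func matches(_ p: Packet) -> Bool {
        switch self {
        case .proto(let name): return p.proto == name
        case .compare(let field, let op, let value):
            let lhs = field == "port" ? p.port : p.length
            switch op {
            case "==": return lhs == value
            case "!=": return lhs != value
            case "<": return lhs < value
            case ">": return lhs > value
            case "<=": return lhs <= value
            default: return lhs >= value  // ">=": the parser admits no other operator
            }
        case .not(let e): return !e.matches(p)
        case .and(let a, let b): return a.matches(p) && b.matches(p)
        case .or(let a, let b): return a.matches(p) || b.matches(p)
        }
    }
}

struct ParseError: Error { let message: String }

struct Parser {
    private let tokens: [String]
    private var pos = 0

    init(_ source: String) {
        // Simplified tokenizer: pad parentheses, then split on whitespace.
        let padded = source.lowercased()
            .replacingOccurrences(of: "(", with: " ( ")
            .replacingOccurrences(of: ")", with: " ) ")
        tokens = padded.split(whereSeparator: { $0.isWhitespace }).map(String.init)
    }

    private var peek: String? { pos < tokens.count ? tokens[pos] : nil }

    private mutating func advance() -> String? {
        guard pos < tokens.count else { return nil }
        pos += 1
        return tokens[pos - 1]
    }

    // expr ::= orExpr, followed by end of input
    mutating func parse() throws -> Expr {
        let e = try orExpr()
        if let extra = peek { throw ParseError(message: "unexpected token '\(extra)'") }
        return e
    }

    // orExpr ::= andExpr ( "or" andExpr )*
    private mutating func orExpr() throws -> Expr {
        var lhs = try andExpr()
        while peek == "or" { pos += 1; lhs = try .or(lhs, andExpr()) }
        return lhs
    }

    // andExpr ::= notExpr ( "and" notExpr )*
    private mutating func andExpr() throws -> Expr {
        var lhs = try notExpr()
        while peek == "and" { pos += 1; lhs = try .and(lhs, notExpr()) }
        return lhs
    }

    // notExpr ::= "not" notExpr | atom
    private mutating func notExpr() throws -> Expr {
        if peek == "not" { pos += 1; return try .not(notExpr()) }
        return try atom()
    }

    // atom ::= "(" expr ")" | predicate (subset only)
    private mutating func atom() throws -> Expr {
        guard let tok = advance() else { throw ParseError(message: "unexpected end of input") }
        if tok == "(" {
            let inner = try orExpr()
            guard advance() == ")" else { throw ParseError(message: "expected ')'") }
            return inner
        }
        if tok == "port" || tok == "length" {
            guard let op = advance(), ["==", "!=", "<", ">", "<=", ">="].contains(op),
                  let raw = advance(), let value = Int(raw) else {
                throw ParseError(message: "expected comparison and number after '\(tok)'")
            }
            return .compare(field: tok, op: op, value: value)
        }
        if ["and", "or", ")"].contains(tok) { throw ParseError(message: "unexpected token '\(tok)'") }
        return .proto(tok)
    }
}

var parser = Parser("tcp and not (port == 80 or port == 443)")
let filter = try parser.parse()
print(filter.matches(Packet(proto: "tcp", port: 8080, length: 60)))  // true
print(filter.matches(Packet(proto: "tcp", port: 443, length: 60)))   // false
```

Because `orExpr` calls `andExpr`, which calls `notExpr`, precedence falls out of the call structure: `not` binds tighter than `and`, which binds tighter than `or`, and parentheses restart at the top rule.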
Files
Test plan
Closes #11
🤖 Generated with Claude Code