Skip to content

Filter expression parser — closes #11#12

Merged
DaxxSec merged 1 commit into
mainfrom
feat/filter-expression-parser
May 13, 2026
Merged

Filter expression parser — closes #11#12
DaxxSec merged 1 commit into
mainfrom
feat/filter-expression-parser

Conversation

@DaxxSec
Copy link
Copy Markdown
Owner

@DaxxSec DaxxSec commented May 13, 2026

Summary

Replaces the flat substring-split filter engine in PacketAnalysisWindowController with a real tokenizer + recursive-descent parser + AST evaluator. Closes the three pre-existing bugs filed in #11.

The three review bugs, before/after

#1 Suspicious TLDs filter is wildly over-broad — `dns and (tk or ml or ga or cf or gq)`

  • Before: " or " split fired first, fell through to substring matching against the info field. "ml" matched every packet whose info contains "html", "xml", or "smtp". A DFIR analyst trusting the label was being misled at the worst possible moment.
  • After: `dns and (info contains ".tk" or info contains ".ml" or …)` — parens are real, leading-`.` substrings only match actual TLD references. Test `testSuspiciousTLDsFilterRespectsParens` pins this against HTML traffic.

#2 Non-Standard Ports excludes legitimate non-standard ports — `tcp and not 80 and not 443 and not 22 and not 53`

  • Before: "not 80" became `!info.contains("80")`. Port 8080 → contains "80" → excluded. Port 5300 → contains "53" → excluded. The filter that was supposed to surface non-standard ports actually hid them.
  • After: `tcp and port != 80 and port != 443 and port != 22 and port != 53` — numeric comparison on the parsed port. Test `testNonStandardPortsExcludesByNumericComparisonNotSubstring` verifies 8080 / 5300 / 4430 pass through correctly.

#3 Eight presets promise semantics the engine couldn't deliver

  • Implemented via length predicates (the parser can now express these): "DNS Tunneling (Long Queries)" → `dns and length > 100`, "Large Outbound Transfers" → `tcp and length > 1000`, "ICMP with Payload (Covert Channel)" → `icmp and length > 64`.
  • Honestly renamed (require payload / header inspection the parser can't reach): "Non-Browser HTTP (curl/wget/python)" → "HTTP (inspect for non-browser UA manually)"; the four TLS variants collapsed to two ("All TLS / SSL" + an "(inspect handshake/SNI/certs manually)" prompt); "Short TCP Connections (Beacon)" → "TCP (inspect for short-connection beacons manually)"; "Base64 in HTTP" → "HTTP (inspect for base64 payloads manually)"; "Port Scanning (SYN Flood)" → "TCP (inspect for SYN-flood patterns manually)". `showFilterInfo` now prompts the analyst on what to inspect by hand.

Grammar

```
expr ::= orExpr
orExpr ::= andExpr ( "or" andExpr )*
andExpr ::= notExpr ( "and" notExpr )*
notExpr ::= "not" notExpr | atom
atom ::= "(" expr ")" | predicate
predicate ::= protocol | "port" op N | "length" op N
| "ip.addr"/"ip.src"/"ip.dst" op IP
| "info" "contains" "string"
| identifier # legacy bare-protocol or substring
| identifier N # sugar: "tcp 80" → tcp AND port == 80
op ::= == | != | < | > | <= | >=
```

Files

  • New: `SecVF/PacketFilter.swift` (~380 lines) — `PacketFilter`, `PacketLike`, tokenizer, recursive-descent parser, AST + evaluator
  • New: `SecVF/Tests/PacketFilterTests.swift` — 17 passing tests
  • `PacketAnalysisWindowController.swift` — `passesFilter` now uses the compiled AST; `setCurrentFilter` caches compile result so per-packet evaluation doesn't re-parse
  • `PacketFilterPresets.swift` — rewritten preset catalog with correct filter strings + honest titles
  • `PacketCaptureManager.swift` — `CapturedPacket` now conforms to `PacketLike` (zero-cost protocol)

Test plan

  • `xcodebuild build` succeeds locally
  • `xcodebuild test` — 267 / 267 passing (up from 249), one pre-existing flaky network test skipped as before
  • CI gate (`test.yml`) will run the same suite on this PR
  • Manually: open Packet Analysis window, type each preset's filter, verify packets pass/fail as expected
  • Manually: type a malformed filter into the live field, verify the table doesn't go empty (compile-fail falls back to no-filter)

Closes #11

🤖 Generated with Claude Code

Replaces the flat substring-split filter engine in
PacketAnalysisWindowController with a proper tokenizer + recursive-
descent parser + AST evaluator. Fixes the three bugs surfaced in
issue #11's review of PR #10.

The parser
==========

`PacketFilter.compile(_:)` produces an evaluable expression. The AST
is purely-functional + thread-safe; `compiledFilter` is cached on
the analysis controller so per-packet evaluation doesn't re-parse.

Grammar
-------
  expr      ::= orExpr
  orExpr    ::= andExpr ( "or" andExpr )*
  andExpr   ::= notExpr ( "and" notExpr )*
  notExpr   ::= "not" notExpr | atom
  atom      ::= "(" expr ")" | predicate
  predicate ::= protocol | "port" op N | "length" op N
              | "ip.addr"/"ip.src"/"ip.dst" op IP
              | "info" "contains" "string"
              | identifier              # legacy bare-protocol or substring
              | identifier N            # sugar: "tcp 80" -> tcp AND port == 80
  op        ::= == | != | < | > | <= | >=

Operator keywords (and/or/not/port/length/info/contains/ip.*) are
reserved; quote them with "..." to match those words literally.

Three review-bug regressions pinned in PacketFilterTests
========================================================

#1 Suspicious TLDs filter respects parens
   `dns and (info contains ".tk" or info contains ".ml" ...)` no
   longer falls into the " or " substring-split trap. HTTP traffic
   carrying "html" does NOT match the DNS-suspicious-TLDs filter
   anymore.

#2 Non-Standard Ports filter is numeric
   `port != 80` is a real comparison. Port 8080 / 5300 / 4430 pass
   through the "not 80 / not 443 / not 53 / not 22" filter as the
   user actually intends.

#3 Misleading presets honestly relabeled
   Eight presets promised semantics the engine can't reach without
   per-packet payload inspection (User-Agent matching, SYN-flag
   inspection, TLS record dissection). They've been split into two
   buckets:

   - Implemented via length predicates: "DNS Tunneling (Long
     Queries)" -> dns AND length > 100; "Large Outbound Transfers"
     -> tcp AND length > 1000; "ICMP with Payload" -> icmp AND
     length > 64.
   - Honestly renamed: "Non-Browser HTTP (curl/wget/python)" ->
     "HTTP (inspect for non-browser UA manually)". Same for the
     four TLS variants (collapsed to two entries), short-TCP
     beacons, base64-in-HTTP, and SYN-flood scanning. The
     showFilterInfo callout now prompts the analyst on what to
     inspect manually.

Tests
=====

PacketFilterTests: 17 new tests covering the three regression cases
above plus boolean precedence, atoms (proto / port / length / IP /
substring), `tcp N` sugar, whitespace + case-insensitivity, and
malformed-input error surfacing.

Full suite: 267 / 267 passing (up from 249), one pre-existing
network-dependent test still skipped as before.

CapturedPacket now conforms to `PacketLike` so the evaluator can
run against the real packet stream without a runtime adapter; the
protocol exists so unit tests can supply lightweight fixtures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@DaxxSec DaxxSec merged commit af34b9a into main May 13, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Packet filter engine: parser doesn't handle parens or numeric exclusions correctly

1 participant