
feat: WebP re-host, faithful ffmpeg CRF, and video resolution ladder for robustness aug #113

Merged: kenobijon merged 1 commit into dev from feat/aug-webp-ffmpeg-resolution on Jun 19, 2026

kenobijon commented Jun 19, 2026


Summary

Follow-up to #112 — extends the augmentation robustness pass with real-world distribution-channel transforms that the initial suite was missing. Scoped entirely to src/gasbench/processing/transforms.py. Targets dev so it stacks onto the same branch #112 ships from.

What changes

  1. WebP re-host pass — new compress_image_webp_pil() wired into the image chain between the two JPEG passes: downscale → JPEG q55 → WebP q75 → JPEG q80. Facebook/Google/many CDNs serve WebP, whose VP8 intra coding leaves a different artifact family than JPEG's DCT blocks, so a detector that survives repeated JPEG can still collapse on it. Pass webp_quality=None to recover the JPEG-only chain.

  2. JPEG chroma subsampling pinned to 4:2:0 (subsampling=2) in compress_image_jpeg_pil. The motivation in #112 rests on chroma-subsampled JPEG destroying high-frequency DCT fingerprints, but Pillow's default subsampling varies by quality and version; pinning it ensures the cited mechanism is actually the one applied.

  3. Faithful FF++ CRF via ffmpeg — new _h264_roundtrip_ffmpeg() feeds raw frames to libx264 -crf N. Encode order is now ffmpeg -crf → cv2 avc1 → per-frame JPEG, and params["method"] records which path ran. cv2's VIDEOWRITER_PROP_QUALITY does not map to CRF and is silently ignored on many OpenCV builds, so the previous path did not reliably produce the requested CRF — this is the faithful reproduction of the FaceForensics++ c23/c40 protocol.

  4. Video resolution ladder: scale_factor param on the video pass (default 1.0 = unchanged FF++ behavior; <1.0 downscales frames before encode to model platform transcodes like 1080→540).
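
For readers who want to see the shape of the image chain in items 1 and 2, here is a minimal sketch using Pillow. It is not the code in transforms.py: the helper names `_roundtrip` and `rehost_chain` are hypothetical, the initial downscale step is left out because its factor is not stated here, and the real helpers may operate on numpy arrays rather than PIL images.

```python
import io
from typing import Optional

from PIL import Image


def _roundtrip(img: Image.Image, fmt: str, **save_kwargs) -> Image.Image:
    """Encode to an in-memory buffer and decode back to RGB (hypothetical helper)."""
    buf = io.BytesIO()
    img.convert("RGB").save(buf, format=fmt, **save_kwargs)
    buf.seek(0)
    return Image.open(buf).convert("RGB")


def rehost_chain(img: Image.Image, webp_quality: Optional[int] = 75) -> Image.Image:
    """JPEG q55 -> optional WebP -> JPEG q80, as described in the list above."""
    # subsampling=2 pins Pillow's JPEG encoder to 4:2:0 chroma subsampling.
    out = _roundtrip(img, "JPEG", quality=55, subsampling=2)
    if webp_quality is not None:
        # Skipped when webp_quality is None, which leaves the JPEG-only chain.
        out = _roundtrip(out, "WEBP", quality=webp_quality)
    return _roundtrip(out, "JPEG", quality=80, subsampling=2)
```

Calling `rehost_chain(img, webp_quality=None)` exercises the opt-out path; both paths return an RGB image of the input size.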

Verification

Tested end-to-end against real numpy / opencv / Pillow / ffmpeg:

  • WebP roundtrip alters pixels (mean abs Δ ≈ 46) and preserves shape/dtype
  • Full image chain deterministic for a fixed seed; webp_quality=None opt-out produces a different (JPEG-only) result
  • Video path confirmed method == "ffmpeg_crf" with correct output shape under crf=40, scale_factor=0.5

Notes / follow-ups

  • Transform-layer only. To expose the new knobs from gasbench run, add --robustness-scale (video) and optionally --webp-quality in cli.py and thread through video_bench.py — not included here.
  • params keys changed: added webp_quality (image), method + scale_factor (video); removed cv2_quality. Worth a glance at recording.py if anything asserts on exact keys.
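
As a starting point for the CLI follow-up, the two flags could be parsed roughly as below. This is a sketch under assumptions: it uses argparse, which may not be what cli.py actually uses, and the parser functions, the "none" spelling for the opt-out, and the accepted value ranges are all hypothetical.

```python
import argparse
from typing import Optional


def parse_webp_quality(value: str) -> Optional[int]:
    """Map 'none' to None (JPEG-only chain); otherwise an int in 0..100."""
    if value.lower() == "none":
        return None
    quality = int(value)
    if not 0 <= quality <= 100:
        raise argparse.ArgumentTypeError("webp quality must be in 0..100")
    return quality


def parse_scale(value: str) -> float:
    """Accept any positive scale; values below 1.0 downscale before encode."""
    scale = float(value)
    if scale <= 0.0:
        raise argparse.ArgumentTypeError("scale must be positive")
    return scale


def build_parser() -> argparse.ArgumentParser:
    # Flag names come from the note above; defaults mirror the transform defaults.
    parser = argparse.ArgumentParser(prog="gasbench run")
    parser.add_argument("--robustness-scale", type=parse_scale, default=1.0)
    parser.add_argument("--webp-quality", type=parse_webp_quality, default=75)
    return parser
```

The parsed `robustness_scale` and `webp_quality` values would then be passed through video_bench.py to the transform layer as `scale_factor` and `webp_quality`.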

🤖 Generated with Claude Code


Note

Medium Risk
Robustness-pass pixels and logged transform params change, which can shift benchmark scores and break anything that asserted on cv2_quality; video encoding now depends on ffmpeg when available.

Overview
Extends the image robustness chain in apply_robustness_augmentations with an optional WebP roundtrip (compress_image_webp_pil, default q75) between the two JPEG steps, and records webp_quality in params (None skips WebP and restores JPEG-only). JPEG roundtrips now force 4:2:0 via subsampling=2 in compress_image_jpeg_pil.

Video robustness is refactored: shared _decode_video_rgb, ffmpeg libx264 -crf as the primary encode path (_h264_roundtrip_ffmpeg), cv2 avc1 as fallback (_h264_roundtrip_cv2), then per-frame JPEG. Returned params add method and scale_factor (optional pre-encode downscale; default 1.0); cv2_quality is removed.
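
The encode order described above can be sketched as an ordered fallback. This is illustrative only: `roundtrip_with_fallback` is a hypothetical name, the encoders are passed in as plain callables, and only the "ffmpeg_crf" method label is taken from this PR; it assumes each helper returns None on failure.

```python
from typing import Callable, Optional, Sequence, Tuple

Frames = list  # stand-in for the decoded RGB frame container
Encoder = Callable[[Frames], Optional[Frames]]


def roundtrip_with_fallback(
    frames: Frames, encoders: Sequence[Tuple[str, Encoder]]
) -> Tuple[Frames, dict]:
    """Try each (method, encoder) pair in order and report which one ran."""
    for method, encode in encoders:
        try:
            out = encode(frames)
        except OSError:
            # e.g. the ffmpeg binary is missing; treat as a failed path.
            out = None
        if out is not None:
            # Record the path that actually produced the output.
            return out, {"method": method}
    raise RuntimeError("no encoder path succeeded")
```

Setting `method` only after an encoder returns frames keeps the logged value tied to the path that really ran.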

Reviewed by Cursor Bugbot for commit d9148fb.

…der to robustness aug

Extends the augmentation robustness pass with real-world distribution-channel
transforms missing from the initial suite:

- WebP roundtrip (compress_image_webp_pil) wired into the image chain between
  the two JPEG passes, modeling CDN/platform re-hosting (Facebook, Google).
  WebP's VP8 intra coding leaves a different artifact family than JPEG DCT, so
  detectors that survive repeated JPEG can still fail it. Opt out with
  webp_quality=None.
- JPEG chroma subsampling pinned to 4:2:0 (subsampling=2) so the pass actually
  applies the high-frequency DCT-fingerprint destruction its motivation cites,
  instead of letting Pillow vary subsampling by quality/version.
- Faithful FaceForensics++ CRF reproduction via the ffmpeg CLI
  (_h264_roundtrip_ffmpeg, real -crf). Encode order is now ffmpeg -> cv2 avc1
  -> per-frame JPEG, with the path used recorded in params["method"]. cv2's
  VIDEOWRITER_PROP_QUALITY does not map to CRF and is ignored on many builds,
  so the previous path did not reliably produce the requested CRF.
- Video resolution ladder via scale_factor (default 1.0 = unchanged FF++
  behavior; <1.0 downscales frames before encode to model platform transcodes).

Verified end-to-end with numpy/opencv/Pillow/ffmpeg: WebP alters pixels, the
full image chain is deterministic per seed, webp opt-out works, and the video
path engages the ffmpeg -crf encoder with correct output shape.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


"-s", f"{W}x{H}", "-r", str(int(fps)), "-i", "-",
"-c:v", "libx264", "-crf", str(int(crf)),
"-pix_fmt", "yuv420p", tmp_path,
]


Odd sizes break ffmpeg CRF

Medium Severity

_h264_roundtrip_ffmpeg encodes with libx264 and -pix_fmt yuv420p but never forces even width or height. libx264 rejects odd dimensions with yuv420p, so for odd-sized inputs the subprocess exits non-zero and the helper returns None. Robustness runs then fall back to cv2 or per-frame JPEG even though ffmpeg is on PATH, and method can sometimes imply a CRF roundtrip that did not happen.
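
One possible fix is to round the frame size down to even numbers before building the ffmpeg command. The sketch below is an assumption-laden illustration, not the merged code: the function names are hypothetical, the arguments before "-s" (rawvideo input, rgb24) are guessed because the quoted snippet starts mid-list, and the rounding used for scale_factor is a guess. Note that scaling alone can introduce odd sizes, for example 1080x1350 at scale_factor=0.5 gives 540x675.

```python
from typing import List, Tuple


def even_dims(width: int, height: int) -> Tuple[int, int]:
    """Round each dimension down to the nearest even number (minimum 2)."""
    return max(2, width - width % 2), max(2, height - height % 2)


def scaled_even_dims(width: int, height: int, scale_factor: float = 1.0) -> Tuple[int, int]:
    """Apply scale_factor, then force even dimensions for yuv420p."""
    return even_dims(int(round(width * scale_factor)), int(round(height * scale_factor)))


def ffmpeg_crf_args(width: int, height: int, fps: float, crf: int, out_path: str) -> List[str]:
    """Build a libx264 -crf command for raw RGB frames piped on stdin."""
    w, h = even_dims(width, height)
    # Frames must be cropped to (h, w) before piping, e.g. frame[:h, :w],
    # so the byte count per frame matches the declared -s size.
    return [
        "ffmpeg", "-y", "-f", "rawvideo", "-pix_fmt", "rgb24",
        "-s", f"{w}x{h}", "-r", str(int(fps)), "-i", "-",
        "-c:v", "libx264", "-crf", str(int(crf)),
        "-pix_fmt", "yuv420p", out_path,
    ]
```

Cropping by at most one row and one column changes the output shape slightly, so callers that compare shapes would need to account for it; padding inside ffmpeg is an alternative with the same caveat.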


kenobijon merged commit 453f447 into dev on Jun 19, 2026
1 check passed