Version: 3.0
Updated: 2026-04-13
Baseline: v1.9.26 (254 routes, 68 core modules, 17 blueprints, 867 tests)
Feature Plan: 302 features across 62 categories (see features.md)
⚡ Active work (2026-Q2 → 2026-Q4) lives in ROADMAP-NEXT.md — Wave A/B/C/D/E covering v1.18.0 → v1.23.0+ from the cross-project OSS research pass. This file is preserved as the long-range 302-feature plan and the v1.14.0 strategic-gap analysis below.
- Never break what works — Every wave ships a working product. No "rewrite everything then test."
- Incremental migration — New code coexists with old. Feature flags gate rollout. Old paths removed only after new paths are proven.
- User-facing value first — Each wave delivers visible improvements, not just internal refactors.
- Measure before optimizing — Add telemetry/logging before assuming bottlenecks.
- Shared infrastructure first — When multiple features need the same foundation (e.g., object tracking, spectral analysis), build the foundation once, then fan out.
- One new dependency per feature maximum — Avoid dep explosion. Prefer extending existing deps (OpenCV, FFmpeg, Pillow) over adding new ones.
Completed work (v1.0 - v1.9.26) moved to ROADMAP-COMPLETED.md.
Features are organized into 7 waves based on dependency chains, shared infrastructure, and priority. Each wave is independently shippable. Feature numbers reference features.md.
| Symbol | Meaning |
|---|---|
| FFmpeg | Pure FFmpeg filter — no Python deps beyond subprocess |
| Pillow | Image composition — already installed |
| OpenCV | Computer vision — already installed (opencv-python-headless) |
| Existing AI | Uses models already in the codebase (Whisper, Demucs, face detection, etc.) |
| New dep | Requires a new pip dependency |
| New model | Requires downloading a new AI model (potentially large) |
| Pipeline | Orchestrates existing modules — no new deps |
Goal: Ship 40+ features using only existing FFmpeg filters, Pillow, NumPy, and current AI models. Maximum user value with minimum risk.
Timeline: 4-6 weeks. New deps: Zero. New routes: ~35.
These are pure FFmpeg filter additions — each is a new route calling run_ffmpeg() with a new filter graph.
| # | Feature | Effort | Dep | Detail |
|---|---|---|---|---|
| 53.2 | Adaptive Deinterlacing | S | FFmpeg | yadif/bwdif filter. Auto-detect via ffprobe field_order or idet filter. |
| 52.1 | Lens Distortion Correction | M | FFmpeg | lenscorrection filter with k1/k2 coefficients. Ship camera profile JSON (source: lensfun). |
| 52.3 | Chromatic Aberration Removal | S | FFmpeg | chromanr filter or per-channel scale via split/scale/merge. |
| 53.5 | Frame Rate Conversion (Optical Flow) | M | FFmpeg | minterpolate filter for up/down conversion. Preset modes. |
| 44.1 | Timecode Burn-In Overlay | S | FFmpeg | drawtext with %{pts\:hms} or timecode option. Configurable position/font. |
| 45.2 | AV1 Encoding Support | M | FFmpeg | libaom-av1 or libsvtav1 encoder. Add to export presets and social platform presets. |
| 45.1 | ProRes Export on Windows | M | FFmpeg | prores_ks encoder. Profile selector (Proxy/LT/422/HQ/4444). |
| 32.1 | Hardware-Accelerated Encoding | M | FFmpeg | Detect NVENC/QSV/AMF. Add h264_nvenc/hevc_nvenc codec options in export. |
| 20.4 | Photosensitive Seizure Detection | S | FFmpeg | Frame-to-frame luminance delta analysis. Flag >3 flashes/sec per ITU-R BT.1702. |
| 38.1 | GIF / WebP / APNG Export | S | FFmpeg | gif/libwebp_anim output format. Palette optimization via palettegen/paletteuse. |
| 3.10 | Film Grain & Vignette (Enhanced) | S | FFmpeg | noise + vignette filters with presets (Super 8, 16mm, 35mm, VHS). |
| 25.1 | Dialogue De-Reverb | M | FFmpeg | arnndn or afftdn with speech-optimized profile. |
| 42.2 | Timelapse Deflicker | M | FFmpeg | deflicker filter or rolling-average luminance normalization per frame. |
| 30.3 | Freeze Frame Insert | S | FFmpeg | Extract frame at timestamp, generate still clip of configurable duration, splice into sequence. |
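Most of these reduce to a one-line filter graph passed to FFmpeg. As an illustrative sketch (the function names here are hypothetical; a real route would go through the project's run_ffmpeg() helper rather than calling subprocess directly), the timecode burn-in (44.1) might look like:

```python
import subprocess

def timecode_filter(fps: int = 30, x: int = 20, y: int = 20) -> str:
    """Build the drawtext filter string for timecode burn-in (44.1)."""
    # Colons inside the timecode value must be escaped for the filter parser.
    return (
        f"drawtext=timecode='00\\:00\\:00\\:00':rate={fps}"
        f":fontsize=32:fontcolor=white:box=1:boxcolor=black@0.5:x={x}:y={y}"
    )

def burn_timecode(src: str, dst: str, fps: int = 30) -> None:
    """Render the overlay; assumes ffmpeg is on PATH and audio is copied."""
    subprocess.run(
        ["ffmpeg", "-y", "-i", src, "-vf", timecode_filter(fps), "-c:a", "copy", dst],
        check=True,
    )
```

Keeping the filter-string construction separate from the subprocess call makes each new filter route trivially unit-testable without video fixtures.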
Image composition overlays using existing Pillow renderer.
| # | Feature | Effort | Dep | Detail |
|---|---|---|---|---|
| 61.1 | Composition Guide Overlay | S | Pillow | Rule-of-thirds, golden ratio, center cross, safe areas on preview frame. Display-only. |
| 36.1 | Platform Safe Zone Overlay | S | Pillow | TikTok/YouTube/Instagram UI element overlays on preview frame. JSON-driven zone definitions. |
| 34.1 | Scrolling Credits Generator | M | Pillow | Bottom-to-top scroll rendered as video via Pillow frame sequence + FFmpeg encode. |
| 34.3 | Lower Third Generator | M | Pillow | Name/title bar with configurable style presets. Burn into video at timestamp range. |
| 20.3 | Color Blind Simulation Preview | S | Pillow | Apply CVD color matrix (deuteranopia, protanopia, tritanopia) to preview frame. |
| 11.2 | Click & Keystroke Overlay | M | Pillow | Parse click/key logs → render ripple animations and key badges as overlay frames. |
| 11.3 | Callout & Annotation Generator | M | Pillow | Numbered callouts, spotlight boxes, blur regions, arrows at timestamps. |
| 18.2 | Retro VHS / CRT Effect | M | Pillow+FFmpeg | Scanlines, chroma shift, noise, tracking artifacts, date stamp. Preset chain. |
| 18.3 | Glitch Effect Pack | M | Pillow+FFmpeg | Datamosh, RGB shift, block displacement, scan distortion. Per-frame render. |
| 48.1 | Highlight Reel Auto-Assembly | M | Pipeline | Score clips by audio energy + motion → select top N → assemble with transitions + music. |
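A minimal sketch of the display-only overlay pattern these features share, using the rule-of-thirds guide (61.1) as the example (function name and styling values are illustrative, not the project's actual renderer API):

```python
from PIL import Image, ImageDraw

def draw_thirds_grid(frame: Image.Image, color=(255, 255, 255, 160), width=2) -> Image.Image:
    """Composite a rule-of-thirds grid onto a copy of a preview frame (61.1).
    Display-only: the source frame is never modified, so nothing is burned
    into the export."""
    out = frame.convert("RGBA")
    overlay = Image.new("RGBA", out.size, (0, 0, 0, 0))
    d = ImageDraw.Draw(overlay)
    w, h = out.size
    for fx in (w // 3, 2 * w // 3):          # vertical thirds
        d.line([(fx, 0), (fx, h)], fill=color, width=width)
    for fy in (h // 3, 2 * h // 3):          # horizontal thirds
        d.line([(0, fy), (w, fy)], fill=color, width=width)
    return Image.alpha_composite(out, overlay)
```

The same structure (transparent overlay, draw, alpha-composite) extends directly to safe-zone rectangles (36.1) and callout shapes (11.3).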
Features that extend already-installed AI models with new analysis modes.
| # | Feature | Effort | Dep | Detail |
|---|---|---|---|---|
| 55.3 | Profanity Bleep Automation | S | Existing AI | Whisper word timestamps + configurable word list → 1kHz tone or silence at flagged words. |
| 61.2 | Shot Type Auto-Classification | M | Existing AI | Face size relative to frame (MediaPipe) → ECU/CU/MCU/MS/WS classification per scene. |
| 29.1 | Shot Type Search & Tagging | M | Existing AI | Store shot type in footage index (FTS5). Enable search by shot type. |
| 56.4 | Room Tone Auto-Generation | M | NumPy | Analyze quiet segments → spectral envelope → shape white noise to match → fill cuts. |
| 61.3 | Intelligent Pacing Analysis | M | Existing AI | Scene detection cut points → mean/median/stddev shot lengths → genre benchmark comparison. |
| 28.1 | Black Frame / Frozen Frame Detection | S | FFmpeg+OpenCV | blackdetect filter + frame differencing for frozen frames. Report timestamps. |
| 28.2 | Audio Phase & Silence Gap Check | S | FFmpeg | aphasemeter + silence detection. Flag phase issues and unnatural gaps. |
| 4.8 | Best Take Selection | M | Existing AI | Per-take scoring: audio quality (SNR), face visibility, sharpness, duration. Rank takes. |
| 11.5 | Dead-Time Detection & Speed Ramp | S | Existing AI | Frame differencing (scene_detect) + silence detection → speed-ramp or cut dead time. |
| 52.4 | Lens Profile Auto-Detection | S | FFmpeg | Parse camera model from ffprobe metadata → look up in lensfun JSON database. |
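The profanity-bleep logic (55.3) is essentially a mapping from Whisper word timestamps to an FFmpeg audio filter. A sketch, assuming word entries shaped like Whisper's `{"word", "start", "end"}` output (the silence variant; a 1 kHz tone version would mix in a sine source instead of muting):

```python
def bleep_filter(words, banned, pad=0.05):
    """Build an ffmpeg `volume` filter that silences each flagged word span,
    with a small padding on both sides. Returns None when nothing matches."""
    spans = [
        (max(0.0, w["start"] - pad), w["end"] + pad)
        for w in words
        if w["word"].strip().lower().strip(".,!?") in banned
    ]
    if not spans:
        return None
    # The volume filter supports timeline editing via enable=...
    cond = "+".join(f"between(t,{s:.3f},{e:.3f})" for s, e in spans)
    return f"volume=enable='{cond}':volume=0"
```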
New composite video modes using FFmpeg overlay/hstack/vstack.
| # | Feature | Effort | Dep | Detail |
|---|---|---|---|---|
| 57.1 | Split-Screen Layout Templates | M | FFmpeg | JSON layout definitions (cells with x/y/w/h %). Composite via overlay filter chain. |
| 57.2 | Reaction Video Template | M | FFmpeg | Main content + PiP webcam. Auto-sync via audio cross-correlation. Audio ducking. |
| 57.3 | Before/After Comparison Export | M | FFmpeg | hstack/vstack, animated wipe via overlay + keyframed crop. Label overlay. |
| 57.4 | Multi-Cam Grid View Export | M | FFmpeg | 2x2 to 4x4 grid. Optional active-speaker highlight border from diarization data. |
| 6.3 | Side-by-Side Before/After Preview | M | FFmpeg | Preview modal showing original vs processed frame. Slider wipe in panel. |
| 3.9 | Multi-Camera Audio Sync | M | FFmpeg+NumPy | Audio fingerprint cross-correlation for time offset detection. Multicam XML output. |
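The audio cross-correlation at the heart of 57.2 and 3.9 can be sketched with NumPy's FFT (mono waveforms assumed; a real implementation would correlate on envelopes or fingerprints for robustness):

```python
import numpy as np

def audio_offset(ref: np.ndarray, other: np.ndarray, sr: int) -> float:
    """Estimate where `other` begins inside `ref`, in seconds, via FFT
    cross-correlation. Returns a negative value if `other` starts first."""
    nfft = 1 << (len(ref) + len(other) - 1).bit_length()   # zero-pad to pow2
    corr = np.fft.irfft(
        np.fft.rfft(ref, nfft) * np.conj(np.fft.rfft(other, nfft)), nfft
    )
    k = int(np.argmax(corr))
    if k > nfft // 2:          # negative lags wrap to the top of the array
        k -= nfft
    return k / sr
```

The detected offset feeds directly into the PiP compositing (57.2) or the multicam XML writer (3.9).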
Wave 1 Total: ~40 features, 0 new dependencies, ~35 new routes
Goal: Build high-value composite workflows that chain existing modules into new products. These are the features that competitors charge monthly subscriptions for.
Timeline: 3-5 weeks (can overlap with Wave 1). New deps: Zero (all existing). New routes: ~20.
| # | Feature | Effort | Dep | Detail |
|---|---|---|---|---|
| 58.1 | Long-Form to Multi-Short Extraction | L | Pipeline | Transcribe → LLM highlights (N clips) → per-clip: trim + face-reframe 9:16 + burn captions + export. Folder of numbered shorts + metadata CSV. |
| 58.4 | Podcast Episode Bundle | M | Pipeline | Denoise + normalize → clean audio export → transcribe → chapters → highlight clips → audiogram → show notes → transcript. All outputs in timestamped folder. |
| 54.4 | AI Video Summary / Condensed Recap | M | Pipeline | Scene detect → transcript LLM analysis → engagement scoring → select top N shots → trim 3-5s each → assemble with crossfades. Configurable target duration. |
| 58.2 | Video-to-Blog-Post Generator | M | Pipeline | Transcribe → LLM structured article with section headings → extract key frames at section boundaries → assemble markdown + images folder. |
| 58.3 | Social Media Caption Generator | S | Pipeline | Per-exported-clip: extract transcript → LLM generates platform-optimized post caption (char limits, hashtags, tone). JSON output alongside each clip. |
| # | Feature | Effort | Dep | Detail |
|---|---|---|---|---|
| 53.3 | Old Footage Restoration Pipeline | L | Pipeline | Stabilize → deinterlace (53.2) → denoise (temporal) → upscale (Real-ESRGAN) → color restore → frame rate conversion. VHS/8mm/Early Digital presets. |
| 40.3 | Video Podcast to Audio-Only | S | FFmpeg | Extract audio track, normalize, denoise, export as podcast-ready MP3/WAV with ID3 tags. |
| 40.4 | Podcast Show Notes Generator | M | Pipeline | Transcribe → LLM: summary, key topics with timestamps, pull quotes, mentioned resources, chapter markers. Markdown/HTML output. |
| 12.3 | Auto Montage Builder | M | Pipeline | Score clips (audio energy + motion) → select top N → detect beats in music track → trim clips to beat intervals → concatenate with transitions. |
| 14.1 | Paper Edit / Script Sync | L | Pipeline | Import script text → fuzzy-match against transcript → generate organized clip assembly with confidence scores. |
| 4.1 | Watch Folder / Hot Folder | M | Pipeline | Monitor directory for new files → auto-run configured workflow → output to destination folder. Background polling with configurable interval. |
| 4.2 | Render Queue | M | Pipeline | Queue multiple export jobs with different settings. Sequential execution with progress tracking. Notification on batch completion. |
| 5.1 | Multi-Platform Batch Publish | L | Pipeline | Single source → batch export for YouTube + TikTok + Instagram + LinkedIn with per-platform reframe, caption style, loudness target, and metadata. |
| # | Feature | Effort | Dep | Detail |
|---|---|---|---|---|
| 24.1 | Shot-Change-Aware Subtitle Timing | M | Pipeline | Scene detection (existing) → post-process captions: split at cut boundaries with minimum gap. Integrate into caption generation pipeline. |
| 16.1 | Beat-Synced Auto-Edit | L | Pipeline | Detect beats (existing librosa) → scene detect → align cuts to nearest beat → assemble. Music video editing automation. |
| 36.4 | Vertical-First Intelligent Reframe | M | Pipeline | Saliency detection + face tracking → auto-crop to 9:16 with smooth path. Better than center-crop for non-face content. |
| 30.1 | Ripple Trim / Gap Close | M | ExtendScript | After cut application, auto-close gaps by rippling subsequent clips. ExtendScript removeEmptyTrackItems(). |
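The caption post-processing in 24.1 is a small pure function over cut timestamps. A sketch with illustrative thresholds (the snap window and gap values are placeholders, not broadcast-spec figures):

```python
def snap_to_cuts(captions, cuts, snap_window=0.5, min_gap=0.08):
    """Snap caption (start, end) edges that fall near a shot change onto the
    cut, leaving a small gap so text never straddles the boundary (24.1)."""
    def nearest(t):
        if not cuts:
            return None
        c = min(cuts, key=lambda x: abs(x - t))
        return c if abs(c - t) <= snap_window else None

    out = []
    for start, end in captions:
        c = nearest(start)
        if c is not None:
            start = c + min_gap       # start just after the cut
        c = nearest(end)
        if c is not None:
            end = c - min_gap         # end just before the cut
        if end > start:               # drop cues the snapping collapsed
            out.append((start, end))
    return out
```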
Wave 2 Total: ~17 features, 0 new dependencies, ~20 new routes
Goal: Complete the remaining architectural work that enables heavy AI features in Waves 4-7. These are not user-facing but are prerequisites for scale.
Timeline: 6-10 weeks (runs in parallel with Waves 1-2). Dependencies: Internal refactoring.
The single most important infrastructure change. Every AI feature in Waves 4-7 benefits from this.
| Task | Detail |
|---|---|
| Worker pool architecture | opencut/workers/ with WorkerManager. Workers are separate Python processes per model family (whisper, demucs, realesrgan, depth, generation). |
| IPC protocol | Workers communicate via localhost HTTP (minimal Flask on random port) or multiprocessing.Queue. Job dispatcher routes by type. |
| GPU memory management | Worker reports VRAM on startup. Dispatcher checks available VRAM against model's known requirement before scheduling. Workers exit after 5-min idle to free VRAM. |
| Graceful degradation | GPU OOM → specific guidance ("Model needs 4GB VRAM, you have 2GB. Switching to CPU.") → optional CPU re-dispatch. |
| Model registry | models.json mapping model name → VRAM requirement, download size, expected load time. UI shows this info. |
Deliverable: No more OOM crashes from model conflicts. GPU utilization visible in status bar.
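The dispatcher's admission check can be sketched as follows. The registry entries and the nvidia-smi query are illustrative assumptions (the real models.json would also carry download size and load time, and non-NVIDIA GPUs need their own probes):

```python
import subprocess

MODEL_REGISTRY = {
    # Illustrative entries only — not the project's actual figures.
    "whisper-large-v3": {"vram_mb": 4000},
    "demucs-htdemucs": {"vram_mb": 2500},
}

def free_vram_mb() -> int:
    """Query free VRAM via nvidia-smi (assumes an NVIDIA GPU; the worker
    manager would fall back to CPU dispatch when this call fails)."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.free",
         "--format=csv,noheader,nounits"]
    )
    return int(out.split()[0])

def can_schedule(model: str, free_mb: int, registry=MODEL_REGISTRY) -> bool:
    """Admission check: schedule only if the model's known VRAM requirement
    fits in what is currently free."""
    return free_mb >= registry[model]["vram_mb"]
```

A rejected job would either queue until a worker's idle timeout frees VRAM, or re-dispatch to CPU per the graceful-degradation row above.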
CEP end-of-life is approximately September 2026. UXP must be production-ready before then.
| Task | Detail |
|---|---|
| Shared component library | extension/shared/ with framework-agnostic components. Both CEP and UXP import from here. Build system outputs two bundles. |
| Feature registry | features.json defines every feature: id, label, endpoint, params schema, requires. Both panels auto-generate UI from this. Adding a feature = one JSON entry + one backend route. |
| UXP feature gap closure | Port remaining ~15% of CEP features to UXP. Mostly: workflow builder, full settings panel, plugin UI. |
| Native UXP timeline access | Replace ExtendScript evalScript() with direct premierepro UXP module for timeline read/write. 10x faster. |
| Premiere menu integration | Right-click → "OpenCut: Remove Silence" / "Add Captions" / "Normalize Audio" via UXP API. |
| CEP deprecation plan | Mark CEP panel as "legacy" in docs. Freeze CEP feature additions. All new features UXP-only after Wave 3. |
Deliverable: UXP panel at 100% parity. CEP can be removed when Adobe enforces it.
Low priority. Flask works fine at current scale. Migrate only if:
- Request validation boilerplate becomes unmanageable (>300 routes)
- WebSocket needs outgrow the current `websockets` library
- Auto-generated OpenAPI docs become essential for plugin developers
If triggered, migrate one blueprint at a time (system → settings → search → nlp → timeline → jobs → captions → audio → video). Pydantic models replace safe_float()/safe_int() hand-validation.
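If the migration is ever triggered, the validation win looks roughly like this (the model name and fields are hypothetical; only the pattern matters):

```python
from pydantic import BaseModel, Field, ValidationError

class TrimRequest(BaseModel):
    """Hypothetical request schema replacing hand-rolled safe_float() checks."""
    path: str
    start: float = Field(ge=0)
    end: float = Field(gt=0)

def parse_trim(payload: dict) -> TrimRequest:
    # ValidationError carries field-level messages — the boilerplate this
    # removes from every individual route handler.
    return TrimRequest(**payload)
```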
Continue incremental migration as files are touched. Priority order:
- API layer (`src/api/types.ts` from OpenAPI schema)
- Store/state management
- Tab modules as they're refactored for new features
No dedicated sprint. Piggyback on feature work.
Goal: Add new feature domains that require 1-2 new dependencies each but significantly expand OpenCut's capability.
Timeline: 6-8 weeks (after Wave 1, can overlap with Wave 3). New deps: 4-6 new pip packages. New routes: ~30.
Shared infrastructure: object detection framework, tracking pipeline, audio masking.
| # | Feature | Effort | New Dep | Detail |
|---|---|---|---|---|
| 55.1 | License Plate Detection & Blur | M | paddleocr or YOLO plate model | Detect plates per frame → track with IoU → Gaussian blur on tracked regions. |
| 55.3 | Profanity Bleep Automation | S | None (done in Wave 1) | — |
| 55.2 | OCR-Based PII Redaction | L | paddleocr (shared with 55.1) | OCR → regex PII patterns (SSN, phone, email, CC) → NER for names → track text regions → blur. |
| 55.4 | Document & Screen Redaction | M | OpenCV (existing) | Edge detection → perspective transform → classify as screen/document/whiteboard → blur surface. |
| 55.5 | Audio Speaker Anonymization | M | Existing (pedalboard) | Diarize → target speaker segments → pitch shift + formant shift or TTS resynthesis. |
New dependency: paddleocr (or reuse existing Tesseract if sufficient). One dep serves 55.1 + 55.2.
| # | Feature | Effort | New Dep | Detail |
|---|---|---|---|---|
| 52.2 | Rolling Shutter Correction | L | gyroflow CLI (subprocess) | Integrate Gyroflow as subprocess with lens profiles. Parse gyro metadata from GoPro/DJI. |
| 13.4 | LOG / Camera Profile Pipeline | M | None | Auto-detect LOG profile from ffprobe metadata → apply bundled technical LUT (free Sony/Canon/Panasonic LUTs). |
| 43.4 | Color Space Auto-Detection | M | None | Read color_primaries/transfer_characteristics from ffprobe → auto-apply correct input transform. |
New dependency: gyroflow CLI (optional, subprocess only — not a pip package).
Shared infrastructure: STFT analysis/resynthesis pipeline.
| # | Feature | Effort | New Dep | Detail |
|---|---|---|---|---|
| 56.4 | Room Tone Auto-Generation | M | None (done in Wave 1) | — |
| 56.3 | AI Environmental Noise Classifier | M | tensorflow-lite or onnxruntime (existing) | YAMNet model (521 sound classes, TFLite). Classify → selective removal via spectral masking. |
| 56.2 | Spectral Repair / Frequency Removal | M | librosa (existing) | STFT → identify persistent spectral peaks (hum/buzz) → attenuate → inverse STFT. Auto-detect mode. |
| 56.1 | Visual Spectrogram Editor | L | librosa (existing) | FFmpeg showspectrumpic or librosa STFT → zoomable canvas in panel → brush tool mask → inverse STFT reconstruction. |
New dependency: None if using onnxruntime (already installed) for YAMNet. Otherwise tflite-runtime (lightweight).
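The auto-detect hum removal in 56.2 reduces to "find the dominant low-frequency tone, attenuate a narrow band around it." A deliberately simplified single-pass sketch (a production version would operate on an STFT via librosa.stft/istft so attenuation can vary over time; the band limits and depth here are illustrative):

```python
import numpy as np

def remove_hum(x: np.ndarray, sr: int, atten_db: float = 40.0, width_hz: float = 5.0):
    """Find the strongest tone in the mains-hum range and notch it in the
    frequency domain. Returns (cleaned signal, detected hum frequency)."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1.0 / sr)
    band = (freqs > 20) & (freqs < 400)            # plausible hum range
    peak_hz = freqs[band][np.argmax(np.abs(spec[band]))]
    notch = np.abs(freqs - peak_hz) < width_hz     # narrow band around the peak
    spec[notch] *= 10 ** (-atten_db / 20.0)
    return np.fft.irfft(spec, len(x)), peak_hz
```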
| # | Feature | Effort | New Dep | Detail |
|---|---|---|---|---|
| 60.1 | Auto Proxy Generation | L | None | Detect clips >1080p → FFmpeg scale to target res + CRF 28 → store in ~/.opencut/proxies/ with manifest. Background job. |
| 60.2 | Proxy-to-Full-Res Swap on Export | S | None | Query timeline clip paths via ExtendScript → check against proxy manifest → verify originals exist → report. |
| 60.3 | Media Relinking Assistant | M | None | ExtendScript: enumerate offline items. Python: recursive search by filename + size matching. Batch relink UI. |
| 60.4 | Duplicate Media Detection | M | None | File size grouping → partial hash (first+last 64KB) → full hash for matches. Optional pHash for content matches. |
New dependency: None.
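The staged filtering in 60.4 (size → partial hash → full hash) might be sketched like this; the final full-hash confirmation is omitted for brevity:

```python
import hashlib
import os
from collections import defaultdict

def partial_hash(path: str, chunk: int = 65536) -> str:
    """Hash the first and last 64 KB plus the file size — a cheap pre-filter
    before committing to a full-file hash (60.4)."""
    h = hashlib.sha256()
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        h.update(f.read(chunk))
        if size > chunk:
            f.seek(max(chunk, size - chunk))
            h.update(f.read(chunk))
    h.update(str(size).encode())   # fold size in so equal edges can't collide
    return h.hexdigest()

def find_duplicates(paths):
    """Group by size, then by partial hash; surviving groups would then be
    confirmed with a full hash."""
    by_size = defaultdict(list)
    for p in paths:
        by_size[os.path.getsize(p)].append(p)
    by_hash = defaultdict(list)
    for group in by_size.values():
        if len(group) > 1:          # unique sizes can't have duplicates
            for p in group:
                by_hash[partial_hash(p)].append(p)
    return [g for g in by_hash.values() if len(g) > 1]
```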
| # | Feature | Effort | New Dep | Detail |
|---|---|---|---|---|
| 13.1 | Real-Time Color Scopes | L | FFmpeg+Pillow | FFmpeg waveform, vectorscope, histogram filters render scope images. Display as image grid in panel. |
| 13.5 | Film Stock Emulation | M | None | Custom 3D LUTs per stock (Kodak/Fuji) + grain overlay + gate weave + halation via blend. Preset package. |
| 13.4 | LOG Camera Profile Pipeline | M | None (listed in 4B) | — |
| 43.1 | ACES Color Pipeline | L | None | ACES IDT/ODT via FFmpeg colorspace + lut3d. Bundled ACES LUTs (free from AMPAS). |
New dependency: None (FFmpeg + bundled LUT files).
Wave 4 Total: ~18 features (excluding duplicates from Wave 1), 1-2 new deps, ~30 new routes
Goal: Build the end-to-end AI dubbing pipeline — the single highest-value new AI capability. This is what ElevenLabs, HeyGen, and Rask.ai charge $50-100/month for.
Timeline: 4-6 weeks (after Wave 3A process isolation is ready). Prerequisite: Wave 3A (GPU process isolation) — dubbing loads multiple large models sequentially. New deps: Minimal (leverages existing Chatterbox, Whisper, Demucs, SeamlessM4T).
| # | Feature | Effort | Detail |
|---|---|---|---|
| 62.1 | End-to-End AI Dubbing Pipeline | XL | Transcribe → translate (SeamlessM4T) → voice-clone TTS (Chatterbox) with duration constraints → stem-separate original (Demucs, remove dialogue, keep music/SFX) → mix dubbed dialogue + original music/SFX → export. |
| 62.2 | Isochronous Translation | L | LLM-assisted translation constrained by segment duration. Iterate: translate → estimate TTS duration from syllable count → if too long, ask LLM to rephrase shorter → if too short, expand. Target ±10% of original. |
| 62.3 | Multi-Language Audio Track Management | M | FFmpeg -map to mux multiple audio streams with language metadata. Panel UI: track list with language dropdown, add/remove, default flag. Export multi-track MKV/MP4 or per-language files. |
| 62.4 | Emotion-Preserving Voice Translation | L | Extract prosody (F0 contour via librosa, RMS energy, speaking rate) from original → generate TTS with neutral prosody → transfer original prosody shape to dubbed audio via WORLD vocoder or pitch manipulation. |
Workflow chain: The dubbing pipeline calls 5 existing modules in sequence. The key new code is the orchestrator (core/dubbing.py) and the isochronous translation loop (core/isochron_translate.py).
New dependency: Potentially pyworld for vocoder-based prosody transfer (62.4). Everything else is already installed.
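The isochronous translation loop (62.2) can be sketched as below. `translate(text, hint=...)` and `est_secs(text)` are caller-supplied stand-ins for the LLM call and the syllable-count duration estimate — hypothetical interfaces, not the project's actual API:

```python
def isochronous_translate(segment_text, target_secs, translate, est_secs,
                          tol=0.10, max_iters=4):
    """Refine a translation until its estimated spoken duration lands within
    ±tol of the original segment duration (62.2 sketch)."""
    text = translate(segment_text, hint=None)
    for _ in range(max_iters):
        ratio = est_secs(text) / target_secs
        if abs(ratio - 1.0) <= tol:
            break
        # Ask for a rephrase in the right direction, then re-check.
        text = translate(segment_text, hint="shorter" if ratio > 1.0 else "longer")
    return text
```

Bounding the iterations matters: each retry is an LLM round-trip, so the loop accepts a slightly off-duration result rather than spinning.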
Wave 5 Total: 4 features, 0-1 new deps, ~8 new routes
Goal: Deep features for professional editors, colorists, and post-production specialists. These differentiate OpenCut from casual tools.
Timeline: 8-12 weeks (can be worked on in parallel tracks). New deps: 2-4.
| # | Feature | Effort | Detail |
|---|---|---|---|
| 61.4 | Saliency-Guided Auto-Crop | M | Face regions (high weight) + motion regions (frame diff) + text regions (OCR) + high-contrast edges → weighted heat map → place crop to maximize saliency. |
| 13.2 | Three-Way Color Wheels | L | SVG color wheel widgets in panel → map wheel positions to FFmpeg colorbalance filter values (lift/gamma/gain). Preview via frame extraction. |
| 13.3 | HSL Qualifier / Secondary Correction | L | OpenCV HSV range masking with feathered edges → apply corrections to masked region only → composite. Preview matte in panel. |
| # | Feature | Effort | Detail |
|---|---|---|---|
| 59.4 | Script-to-Rough-Cut Assembly | XL | Batch transcribe all footage → fuzzy-match transcript segments against script text → rank matches by similarity + audio quality + face visibility → assemble best take per segment as OTIO/Premiere XML. |
| 59.2 | Shot List Generator from Screenplay | M | Parse screenplay format (INT./EXT., ACTION, DIALOGUE) → LLM suggests shot count and camera angles per scene → export as CSV. |
| 59.1 | AI Storyboard Generation from Script | L | Parse script into shots → generate one image per shot via Stable Diffusion or external API → layout as storyboard grid with descriptions → export PDF. |
| 59.3 | Mood Board Generator from Footage | M | Extract keyframes → k-means color clustering → style tags (warm/cold, contrast, saturation) → suggest matching LUTs → compile as visual reference image. |
| # | Feature | Effort | Detail |
|---|---|---|---|
| 53.1 | Corrupted File Recovery | M | Detect corruption type (missing moov, truncated stream). For missing moov: untrunc algorithm with reference file. For truncated: ffmpeg -err_detect ignore_err salvage. Report recovery stats. |
| 53.4 | SDR-to-HDR Upconversion | L | FFmpeg zscale (bt709 → bt2020) + inverse tone mapping. Apply PQ/HLG transfer function. Embed ST.2086 metadata. |
| 13.6 | Power Windows with Tracking | L | Shape masks (circle, rect, polygon) in panel → track via MediaPipe (face) or SAM2 (object) → apply corrections inside/outside mask via per-frame FFmpeg filter. |
| # | Feature | Effort | Detail |
|---|---|---|---|
| 35.1 | Selective Redaction Tool | M | Click-to-select regions in preview → track across frames → blur/pixelate/black. Export redaction log for audit trail. |
| 35.2 | Chain of Custody Metadata | S | SHA-256 hash of original + all operations applied + timestamps → embed as metadata or export as sidecar JSON. |
| 35.3 | Forensic Enhancement | M | Stabilize + denoise + sharpen + contrast stretch + frame interpolation for low-quality surveillance footage. |
| # | Feature | Effort | Detail |
|---|---|---|---|
| 20.1 | Caption Compliance Checker | M | Parse captions → check against rulesets (Netflix <=42 CPL, FCC <=32 CPL, BBC <=160 WPM, min duration, CPS). Flag violations with auto-fix suggestions. |
| 20.2 | Audio Description Track Generator | L | Detect dialogue pauses (existing VAD) → extract key frames during pauses → describe via LLM vision → TTS synthesis → mix into gaps → export as AD track. |
| 27.1 | C2PA Content Credentials | M | Embed Content Authenticity Initiative metadata (origin, edit history, AI disclosure). c2pa-python library. |
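The per-cue checks in 20.1 are straightforward rule evaluation. A sketch, with the CPL limits taken from the table above and the CPS values as illustrative placeholders (not official platform figures):

```python
RULESETS = {
    # max_cpl from the feature table; max_cps values are illustrative.
    "netflix": {"max_cpl": 42, "max_cps": 20},
    "fcc": {"max_cpl": 32, "max_cps": 20},
}

def check_caption(text: str, start: float, end: float, ruleset: str):
    """Return a list of rule violations for one caption cue (20.1 sketch)."""
    rules = RULESETS[ruleset]
    issues = []
    for line in text.splitlines():
        if len(line) > rules["max_cpl"]:
            issues.append(f"line is {len(line)} chars, max {rules['max_cpl']} CPL")
    duration = end - start
    cps = len(text.replace("\n", "")) / duration if duration > 0 else float("inf")
    if cps > rules["max_cps"]:
        issues.append(f"reading speed {cps:.1f} CPS exceeds {rules['max_cps']}")
    return issues
```

Auto-fix suggestions would follow from the same data: re-wrap long lines, or extend a cue's end time until CPS drops under the limit.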
Wave 6 Total: ~16 features, 2-3 new deps, ~25 new routes
Goal: Forward-looking AI capabilities and niche professional features. These are differentiators, not table-stakes.
Timeline: Ongoing (8-16 weeks, lowest priority). New deps: Several (heavy AI models). Prerequisite: Wave 3A (GPU process isolation) is essential for multiple large models.
| # | Feature | Effort | New Dep | Detail |
|---|---|---|---|---|
| 54.2 | Image-to-Video Animation | L | diffusers (existing) | SVD or CogVideoX with image conditioning → 2-6s clip from still image + motion prompt. |
| 54.5 | AI Background Replacement | L | diffusers (existing) | RVM foreground extraction + Stable Diffusion background from text prompt → composite. |
| 54.1 | AI Outpainting / Frame Extension | L | diffusers (existing) | Extend canvas to target aspect ratio → inpaint borders via ProPainter or SD. Keyframe-based for temporal consistency. |
| 54.3 | AI Scene Extension | XL | diffusers (existing) | Feed last N frames to video prediction model → generate continuation. Best for static scenes. |
| 21.1 | Multimodal Timeline Copilot | XL | LLM API (existing) | Chat interface backed by multimodal AI that sees video + audio + transcript. Navigate, select, edit via natural language. |
| # | Feature | Effort | Detail |
|---|---|---|---|
| 51.2 | Equirectangular to Flat Projection | M | FFmpeg v360 filter. Keyframeable yaw/pitch/roll for virtual camera paths. |
| 51.3 | FOV Region Extraction from 360 | M | Face detection in equirectangular space → per-speaker flat extraction with smooth tracking → multicam XML. |
| 51.1 | 360 Video Stabilization | L | Parse gyro metadata (GoPro GPMF, Insta360) → apply inverse rotation via FFmpeg v360. |
| 51.4 | Spatial Audio Alignment | L | Map speaker positions from face detection → route mono dialogue to correct ambisonic channel. First-order ambisonics output. |
| # | Feature | Effort | Detail |
|---|---|---|---|
| 41.1 | DJI Telemetry Data Overlay | M | Parse DJI SRT files → render altitude, speed, GPS, battery as configurable overlay. |
| 42.1 | Image Sequence Import & Assembly | M | Import folder of images (TIFF, EXR, DPX, PNG) → assemble as video with configurable FPS and transitions. |
| 39.1 | Elgato Stream Deck Integration | M | WebSocket/HTTP listener for Stream Deck commands → map buttons to OpenCut operations. Plugin for Stream Deck SDK. |
| 12.1 | Gaming Highlight / Kill Detection | L | Multi-signal fusion: audio peaks + motion intensity + optional OCR on kill feed → score segments → extract top clips. |
| 33.1 | Lecture Recording Auto-Split | M | Scene detection + chapter generation → split lecture by topic → generate per-topic clips with title cards. |
| 46.1 | Multi-Step Autonomous Editing Agent | XL | LLM plans editing steps from high-level instruction → executes via OpenCut API → iterates on result quality. Full agent loop with human review checkpoints. |
Wave 7 Total: ~15 features, 0-2 new deps (most already installed), ~20 new routes
```
Wave 1 (Quick Wins)     |=============================|
Wave 2 (Pipelines)      |=======================|
Wave 3A (GPU Isolation) |========================|
Wave 3B (UXP Parity)    |=====================|
Wave 4 (New Domains)    |========================|
Wave 5 (AI Dubbing)     |================|
Wave 6 (Pro Features)   |===========================|
Wave 7 (Emerging)       |=========================>
Wk 1    Wk 6    Wk 12   Wk 18   Wk 24   Wk 30+
```
Critical path: Wave 3A (GPU isolation) must land before Waves 5 and 7A (heavy AI features).
Parallel tracks:
- Wave 1 + Wave 2 can run simultaneously (different developers or even same developer — no conflicts)
- Wave 3A + Wave 3B are independent
- Wave 4 can start as soon as Wave 1 is done (shares no code)
- Wave 6 features are independent of each other (can be cherry-picked)
| Milestone | Routes | Core Modules | Tests (est.) |
|---|---|---|---|
| Current (v1.9.26) | 254 | 68 | 867 |
| After Wave 1 | ~290 | ~78 | ~1,050 |
| After Wave 2 | ~310 | ~85 | ~1,200 |
| After Wave 4 | ~340 | ~95 | ~1,400 |
| After Wave 5 | ~348 | ~99 | ~1,500 |
| After Wave 6 | ~373 | ~110 | ~1,700 |
| After Wave 7 | ~393 | ~120 | ~1,900 |
| # | Feature | Wave | Effort | Why Critical |
|---|---|---|---|---|
| 3A | GPU Process Isolation | 3 | XL | Prerequisite for all heavy AI features. Eliminates OOM crashes. |
| 3B | UXP Full Parity | 3 | XL | CEP end-of-life ~Sept 2026. Must be ready before then. |
| 32.1 | Hardware-Accelerated Encoding | 1 | M | Users with GPUs expect NVENC/QSV. Every other tool has this. |
| 58.1 | Long-Form to Multi-Short Extraction | 2 | L | $228/year competitor (Opus Clip). Highest-value pipeline. |
| # | Feature | Wave | Effort | Why High Impact |
|---|---|---|---|---|
| 62.1 | End-to-End AI Dubbing | 5 | XL | $50-100/month competitor category. Uses all existing modules. |
| 57.1 | Split-Screen Templates | 1 | M | CapCut/iMovie table-stakes. Massive content category. |
| 55.1 | License Plate Blur | 4 | M | Privacy law compliance. Every content creator needs this. |
| 55.3 | Profanity Bleep | 1 | S | Broadcast requirement. Trivial to build. |
| 53.2 | Adaptive Deinterlacing | 1 | S | Every NLE has this. Legacy footage is common. |
| 52.1 | Lens Distortion Correction | 1 | M | Standard camera correction. lensfun database is free. |
| 56.4 | Room Tone Auto-Generation | 1 | M | iZotope RX feature. Makes silence removal sound professional. |
| 60.1 | Auto Proxy Generation | 4 | L | Premiere/Resolve/FCPX all have this. 4K editing prerequisite. |
| 61.2 | Shot Type Classification | 1 | M | Enables intelligent editing decisions and footage search. |
| 45.2 | AV1 Encoding | 1 | M | Modern codec with 30-50% bitrate savings. YouTube prefers it. |
| 45.1 | ProRes Export (Windows) | 1 | M | Professional delivery format. Resolve offers this on Windows. |
| 13.1 | Real-Time Color Scopes | 6 | L | Every colorist needs scopes. Color tools are blind without them. |
| 59.4 | Script-to-Rough-Cut | 6 | XL | Biggest time saver in post-production. Avid ScriptSync competitor. |
| 20.1 | Caption Compliance Checker | 6 | M | Netflix/FCC/BBC requirements. Prevents platform rejection. |
| 24.1 | Shot-Change-Aware Subtitle Timing | 2 | M | Broadcast QC standard. Simple post-processing. |
All remaining Wave 1-6 features not listed above (~60 features).
All Wave 7 features + FastAPI migration + TypeScript + niche items (~40 features).
| Metric | v1.9.26 (Current) | After Waves 1-2 | After Waves 3-5 | After Waves 6-7 |
|---|---|---|---|---|
| API routes | 254 | ~310 | ~348 | ~393 |
| Core modules | 68 | ~85 | ~99 | ~120 |
| Tests | 867 | ~1,200 | ~1,500 | ~1,900 |
| Time to first useful action | ~30s (workflow) | ~15s (pipeline) | ~10s (context + agent) | ~5s (copilot) |
| Install success rate | ~90% | ~92% | ~95% (isolation) | ~99% (Docker) |
| Competitor features covered | ~60% | ~75% | ~85% | ~95% |
| Features available in UXP | ~85% | ~90% | 100% | 100% |
| New deps added | 0 | 0 | 1-2 | 4-6 |
| Risk | Impact | Mitigation |
|---|---|---|
| CEP deprecation before UXP ready | High | Wave 3B is P0. Start immediately. Freeze CEP feature additions. |
| GPU process isolation complexity | High | Start with simple subprocess model. Upgrade to full worker pool later. Ship incremental improvements. |
| AI model download sizes | Medium | Models are optional. Clear size warnings in UI. Pre-download in installer. Offer cloud API fallback where possible. |
| Too many features → quality regression | High | Every new feature gets a smoke test before merge. Ruff lint on CI. No feature without a test. |
| Dependency conflicts from new packages | Medium | One new dep per feature max. Pin versions. Test in isolated venv before adding to pyproject.toml. |
| Scope creep from 302-feature plan | Medium | Waves are independently shippable. Only commit to one wave at a time. Review and reprioritize between waves. |
This roadmap should be reviewed at the start of each wave and reprioritized based on user feedback, competitive landscape changes, and lessons learned from the previous wave.
Auditor: Principal Systems Architect analysis
Date: 2026-04-14
Baseline: v1.14.0 (1,088 routes, 408 core modules, 83 blueprints, 87 test files, 6,925 tests)
Method: Full codebase scan, security audit, architecture bottleneck analysis, test/CI pipeline review
Context: This roadmap was authored at v1.9.26 (254 routes, 68 modules). The codebase has since grown 4.3x in routes and 6x in modules. The Wave 1-7 structure and growth projections are now obsolete — the "After Wave 7" target of ~393 routes was surpassed at v1.10.5. This analysis identifies the gaps that the rapid feature expansion has opened.
- **GPU process isolation is still unimplemented (Wave 3A).** This was marked P0 and remains the single most critical infrastructure gap. `MAX_CONCURRENT_JOBS = 10` in `opencut/jobs.py:42` allows 10 simultaneous ML model loads into VRAM. PyTorch models (Demucs, Real-ESRGAN, InsightFace, SAM2, CLIP, etc.) each consume 500MB-4GB VRAM. Concurrent loads will OOM on consumer GPUs. No memory reservation, no model-aware scheduling, no graceful degradation path exists. Every AI feature added since v1.10 has widened this gap. The 408-module codebase now has 40+ modules that load GPU models — 6x more than when Wave 3A was planned.
  - Recommended action: Implement a GPU memory budget system immediately. At minimum: reduce `MAX_CONCURRENT_JOBS` to 3 for GPU-tagged routes, add a `@gpu_exclusive` decorator that serializes GPU model access behind a semaphore, and report VRAM usage in `/system/status`.
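A minimal sketch of what such a decorator could look like. The name `gpu_exclusive` comes from the recommendation above; the 2-slot budget is a placeholder, since the real limit should come from measured VRAM headroom.

```python
import functools
import threading

# Hypothetical budget: at most 2 GPU model loads may run concurrently.
# The real limit should come from measured VRAM headroom, not a constant.
_GPU_SLOTS = threading.BoundedSemaphore(2)

def gpu_exclusive(func):
    """Serialize GPU-heavy work behind a shared semaphore.

    Any function that loads a model into VRAM acquires a slot before
    running and releases it afterwards, so concurrent jobs queue
    instead of over-committing GPU memory.
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with _GPU_SLOTS:  # blocks until a GPU slot is free
            return func(*args, **kwargs)
    return wrapper

@gpu_exclusive
def run_inference(clip_path):
    # placeholder for a real model load + inference
    return f"processed {clip_path}"
```

The decorator composes with the existing `@async_job` pattern: jobs still run concurrently, but GPU sections within them are serialized.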
- **Rate limiting covers 4% of async routes.** Security audit found 597 async route handlers but only 23 rate-limit calls. The `require_rate_limit()` decorator exists and works, but was only applied to model-install and a handful of AI routes. All 574 unprotected async routes accept concurrent requests limited only by `MAX_CONCURRENT_JOBS = 10`. A single client can trivially exhaust all 10 job slots with expensive operations (batch rendering, video processing, ML inference), starving other requests.
  - Recommended action: Introduce rate-limit categories (`gpu_heavy`, `cpu_heavy`, `io_bound`, `light`) and apply to all async routes. GPU-heavy operations should share a pool of 2-3 concurrent slots. CPU-heavy should cap at 4-6.
- **Test coverage is broad but shallow.** 87 test files exist with 6,925 test functions, but the architecture audit reveals 97% of the 408 core modules lack dedicated behavioral tests — they're only exercised indirectly through route smoke tests. The smoke tests in `test_route_smoke.py` use broad status code assertions like `assert resp.status_code in (200, 400, 429)`, which pass regardless of whether the feature works correctly. CI enforces only 50% line coverage (`--cov-fail-under=50` in `build.yml`), which is insufficient for a codebase of this size and complexity.
  - Recommended action: Raise CI coverage threshold to 65% (target 80% over 2 sprints). Add schema validation for route responses (JSON structure, not just "is JSON"). Prioritize integration tests for the 40 GPU-model-loading modules — these are the highest-risk code paths with the least coverage.
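A tiny illustration of the kind of response-schema assertion meant here, using a hand-rolled checker. The `job_id`/`state`/`progress` fields are hypothetical; a real suite would likely use a library such as `jsonschema`.

```python
def assert_schema(payload, schema):
    """Recursively check that `payload` has the keys and value types
    declared in `schema`. A tiny stand-in for a real JSON-schema
    validator, to show what "more than just a status code" looks like."""
    for key, expected in schema.items():
        assert key in payload, f"missing key: {key}"
        if isinstance(expected, dict):
            assert isinstance(payload[key], dict), f"{key} should be an object"
            assert_schema(payload[key], expected)
        else:
            assert isinstance(payload[key], expected), (
                f"{key} should be {expected.__name__}"
            )

# Hypothetical example: a job-status response must carry these fields,
# not merely "be JSON and return 200".
JOB_STATUS_SCHEMA = {"job_id": str, "state": str, "progress": float}
assert_schema({"job_id": "abc", "state": "running", "progress": 0.4},
              JOB_STATUS_SCHEMA)
```

Used inside a smoke test, `assert_schema(resp.get_json(), JOB_STATUS_SCHEMA)` catches a handler that silently started returning a different shape, which the status-code check cannot.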
- **Roadmap growth projections are 3x out of date.** The "Route Growth Projection" table estimates 393 routes after all 7 waves. Actual count is 1,088 — a 2.8x overshoot. The "Success Metrics" table, "Completed Work" section, and wave feature lists don't reflect v1.10-v1.14 additions (categories 63-77, 155 new core modules, 20 new route blueprints). The roadmap should be rebased to reflect current reality so it can be trusted for planning.
  - Recommended action: Rebase all tables to v1.14.0 actuals. Mark Wave 1-2 features that were implemented in v1.10-v1.14 as DONE. Update the dependency legend with new module families. Revise success metrics to reflect the 1,088-route baseline.
- **`helpers.py` is a god module (350 imports).** Every core module and most route files import from `opencut/helpers.py`. It contains FFmpeg execution, video probing, output path logic, temp file cleanup, package installation, and progress utilities — responsibilities that span 6+ concerns. This makes it a merge conflict magnet, impossible to test in isolation, and a startup bottleneck (every import chain pulls in the entire module).
  - Recommended action: Decompose into `helpers/ffmpeg.py`, `helpers/video_probe.py`, `helpers/paths.py`, `helpers/cleanup.py`, and `helpers/packages.py`. Re-export from `helpers/__init__.py` for backward compat. Do this incrementally during feature work, not as a dedicated refactor sprint.
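The backward-compat facade can be demonstrated in a self-contained snippet. In the repo it would just be `from .ffmpeg import run_ffmpeg` inside `helpers/__init__.py`; here the two modules are built in-memory so the example runs on its own, and the `run_ffmpeg` body is a stand-in, not the real implementation.

```python
import sys
import types

# helpers/ffmpeg.py holds the implementation after decomposition:
ffmpeg_mod = types.ModuleType("helpers.ffmpeg")

def run_ffmpeg(args):
    """Illustrative stand-in for the real FFmpeg runner."""
    return ["ffmpeg", "-y", *args]

ffmpeg_mod.run_ffmpeg = run_ffmpeg

# helpers/__init__.py becomes a thin facade that only re-exports:
#     from .ffmpeg import run_ffmpeg
facade = types.ModuleType("helpers")
facade.ffmpeg = ffmpeg_mod
facade.run_ffmpeg = ffmpeg_mod.run_ffmpeg
sys.modules["helpers"] = facade
sys.modules["helpers.ffmpeg"] = ffmpeg_mod

# Existing call sites keep working unchanged:
from helpers import run_ffmpeg as legacy
assert legacy(["-i", "in.mp4", "out.mp4"])[0] == "ffmpeg"
```

Because old and new import paths resolve to the same function objects, the 350 existing imports never need a flag-day change; call sites can migrate to `helpers.ffmpeg` opportunistically.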
- **UXP migration has 5 months remaining.** CEP end-of-life is approximately September 2026. The roadmap states UXP is at ~85% feature parity (Wave 3B). The UXP panel (`extension/com.opencut.uxp/`) has 7 tabs vs. CEP's 8, and the UXP `main.js` is 1,523 lines vs. CEP's 7,730 — indicating significant feature gaps in the frontend. No UXP-specific tests exist in CI. The CEP panel continues to receive features (v1.14.0 version bumps touch CEP files), violating the roadmap's "freeze CEP feature additions" directive.
  - Recommended action: Audit UXP vs. CEP parity at the feature level (not the tab level). Add a UXP smoke test to CI. Enforce the CEP freeze — new frontend features go to UXP only.
- **No type checking in CI.** 523 Python files with no mypy or pyright enforcement. Type errors (None where str expected, dict where dataclass expected, wrong callback signature) are caught at runtime — if at all. The `on_progress` callback pattern is already documented in CLAUDE.md as a gotcha (core modules call with 1 arg, routes define closures with 2 args), which is exactly the class of bug static typing catches.
  - Recommended action: Add `mypy --ignore-missing-imports opencut/` to CI. Start with `--no-strict` and fix errors incrementally. Target: 0 type errors in `opencut/core/` within 2 sprints.
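A sketch of how a `typing.Protocol` could pin down the 1-argument `on_progress` contract so mypy flags 2-argument closures at check time instead of letting them crash at runtime. Names here are illustrative, not taken from the codebase.

```python
from typing import Protocol

class ProgressCallback(Protocol):
    """The 1-argument progress signature core modules actually call."""
    def __call__(self, fraction: float) -> None: ...

def render(on_progress: ProgressCallback) -> None:
    """Illustrative core-module function: always calls back with one arg."""
    for step in range(1, 5):
        on_progress(step / 4)

# A route-side closure with the correct arity type-checks and runs:
seen: list[float] = []
render(lambda fraction: seen.append(fraction))
# A 2-argument closure (the CLAUDE.md gotcha) would fail mypy against
# the Protocol above, rather than raising TypeError mid-job.
```

The Protocol costs nothing at runtime; its whole value is that `mypy` now has a single source of truth for the callback shape.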
- **Untracked subprocesses can orphan on cancel.** The `@async_job` decorator registers the job's main thread for cancellation, and `_register_job_process()` tracks Popen handles. But 158 subprocess calls across core modules call `subprocess.run()` directly — these finish synchronously within the job thread but can't be interrupted mid-execution. If a user cancels a job while FFmpeg is mid-render (a 30-minute operation), the FFmpeg process runs to completion even though the job is marked cancelled. The process exit code is then silently discarded.
  - Recommended action: Wrap long-running subprocess calls in a pattern that checks the `job_cancelled` flag and sends SIGTERM to the child process. Alternatively, refactor `run_ffmpeg()` in helpers.py to accept a `job_id` parameter and auto-register the Popen for cancellation.
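A hedged sketch of the polling wrapper described above. The function name, the `threading.Event` cancellation flag, and the poll interval are assumptions; a real version would also propagate or log the child's exit code on cancel.

```python
import subprocess
import sys
import threading

def run_cancellable(cmd, cancel_event: threading.Event, poll_interval=0.5):
    """Run a child process while polling a cancellation flag.

    Unlike a bare subprocess.run(), this can stop a long FFmpeg render
    mid-flight: when cancel_event is set, the child receives SIGTERM
    and the caller sees a cancellation error instead of the exit code
    being silently discarded.
    """
    proc = subprocess.Popen(cmd)
    while True:
        try:
            return proc.wait(timeout=poll_interval)  # finished normally
        except subprocess.TimeoutExpired:
            if cancel_event.is_set():
                proc.terminate()  # SIGTERM on POSIX, TerminateProcess on Windows
                proc.wait()
                raise RuntimeError("job cancelled")
```

This keeps the synchronous call shape of `subprocess.run()`, so migrating the 158 direct call sites is a mechanical substitution rather than a restructuring.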
- **No security scanning in CI pipeline.** The `build.yml` workflow runs ruff lint and pytest but has no security tooling: no bandit (Python security linter), no CodeQL (GitHub's code scanning), no Dependabot/Snyk (dependency vulnerability scanning), no SBOM generation. For a project that executes FFmpeg subprocesses, runs `pip install` at runtime via `safe_pip_install()`, and loads ML models from external sources, this is a meaningful gap.
  - Recommended action: Add `bandit -r opencut/ -ll` to CI (catches high-confidence security issues). Enable GitHub Dependabot for dependency alerts (zero-effort, just add `dependabot.yml`). Add CodeQL for deeper analysis.
- **Temp file accumulation under load.** 93 modules create temp files via `tempfile.mkstemp()` or `NamedTemporaryFile()`. The deferred cleanup mechanism (`_schedule_temp_cleanup()` in helpers.py) uses a 5-second delay with 3 retries. Under concurrent load (10 video processing jobs), this means hundreds of multi-GB temp files (intermediate FFmpeg outputs, extracted frames, model outputs) can accumulate before cleanup fires. No disk quota, no max-temp-size check, no cleanup-on-startup sweep.
  - Recommended action: Add a startup sweep of `tempfile.gettempdir()` for stale `opencut_*` temp files. Add a periodic (60s) background cleanup for files older than 10 minutes. Log temp disk usage in `/system/status`.
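A possible shape for the startup sweep. The `opencut_` prefix matches the recommendation above; the age threshold, return value, and error handling are assumptions.

```python
import glob
import os
import tempfile
import time

def sweep_stale_temp(prefix="opencut_", max_age_s=600):
    """Delete leftover temp files older than max_age_s seconds.

    Intended to run once at startup (and optionally every 60s) so that
    crashed or cancelled jobs cannot accumulate multi-GB intermediates
    indefinitely. Returns the number of bytes reclaimed.
    """
    reclaimed = 0
    now = time.time()
    pattern = os.path.join(tempfile.gettempdir(), prefix + "*")
    for path in glob.glob(pattern):
        try:
            st = os.stat(path)
            if now - st.st_mtime > max_age_s and os.path.isfile(path):
                reclaimed += st.st_size
                os.remove(path)
        except OSError:
            pass  # raced with another cleaner, or file still in use
    return reclaimed
```

The age threshold matters: sweeping only files older than the longest plausible job avoids deleting intermediates that a live render is still writing.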
- **25+ tests use `time.sleep()`, creating flaky CI.** Tests in `test_batch_executor.py`, `test_batch_parallel.py`, `test_boolean_coercion.py`, `test_integration_ffmpeg.py`, and `test_preview_realtime.py` contain sleeps ranging from 10ms to 500ms. These are timing-dependent and will intermittently fail on slow CI runners, Windows VMs, or under load. Additionally, `test_solver_agent.py` uses `random.seed(42)` but other tests don't seed, introducing non-determinism.
  - Recommended action: Replace `time.sleep()` in tests with event-based synchronization (`threading.Event`, condition variables). For async result tests, poll with a timeout rather than a fixed sleep. Audit and seed all random usage.
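The replacement pattern can be shown in one small test. This is illustrative, not taken from the repo's test files: the worker signals completion via a `threading.Event`, so the test waits with a generous timeout instead of guessing a sleep duration.

```python
import threading

def test_worker_completes_without_sleep():
    """Event-based sync: fast when the worker is fast, and it only
    fails if the worker is genuinely broken, not because a CI runner
    happened to be slower than a hardcoded time.sleep()."""
    done = threading.Event()
    result = {}

    def worker():
        result["value"] = 42
        done.set()  # signal completion instead of relying on timing

    threading.Thread(target=worker).start()
    # Generous timeout: costs nothing in the passing case, because
    # wait() returns as soon as the event is set.
    assert done.wait(timeout=5.0), "worker never signalled completion"
    assert result["value"] == 42
```

The same idea applies to async-result endpoints: poll the job status with a deadline rather than sleeping a fixed 100ms and hoping the job finished.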
- **No auto-generated API documentation.** With 1,088 routes across 83 blueprints, there is no OpenAPI/Swagger spec, no auto-generated endpoint catalog, and no machine-readable API schema. Plugin developers and external integrators must read route source code. The roadmap's Wave 3C notes FastAPI migration (which brings auto-generated OpenAPI) but defers it. The original trigger — "if >300 routes" — was passed long ago.
  - Recommended action: Generate an OpenAPI spec from Flask routes using `flask-smorest` or `apispec` without migrating to FastAPI. Serve Swagger UI at `/api/docs` for development mode only. This is a 1-day effort that unlocks plugin ecosystem development.
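Short of adopting `flask-smorest`, even a hand-rolled skeleton over `(rule, methods)` pairs (e.g. harvested from Flask's `app.url_map`) yields a browsable route catalog. This sketch is stdlib-only and entirely illustrative; the example routes are invented.

```python
import re

def openapi_skeleton(routes, title="OpenCut API", version="1.14.0"):
    """Build a minimal OpenAPI 3 document from (rule, methods) pairs.

    No descriptions or request/response schemas yet, but enough for
    Swagger UI to render an endpoint catalog. Flask converter syntax
    like "/<int:job_id>" is rewritten to OpenAPI's "{job_id}".
    """
    paths = {}
    for rule, methods in routes:
        path = re.sub(r"<(?:[^:<>]+:)?([^<>]+)>", r"{\1}", rule)
        entry = paths.setdefault(path, {})
        for m in methods:
            entry[m.lower()] = {"responses": {"200": {"description": "OK"}}}
    return {
        "openapi": "3.0.3",
        "info": {"title": title, "version": version},
        "paths": paths,
    }

# Hypothetical routes, standing in for app.url_map iteration:
spec = openapi_skeleton([("/jobs/<int:job_id>", ["GET", "DELETE"]),
                         ("/render", ["POST"])])
```

Dumping this dict as JSON and pointing a static Swagger UI at it is the "1-day effort" version; `apispec` or `flask-smorest` would then layer real schemas on top.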
- **Blueprint registration is sequential and eager.** `register_blueprints()` in `routes/__init__.py` performs 83 sequential `import` statements at app startup. Each import may trigger module-level initialization (cache setup, constant computation, availability checks). Measured impact is 2-5 seconds on startup — not a production issue, but noticeable during development when the server auto-restarts on file changes.
  - Recommended action: No immediate action needed. If dev-cycle time becomes a complaint, implement lazy blueprint registration (register on first request to a URL prefix).
- **No performance regression detection.** No benchmarks, no load tests, no response-time tracking in CI. With 1,088 routes and 408 modules, a single change to `helpers.py` or `jobs.py` could degrade performance across hundreds of endpoints with no visibility.
  - Recommended action: Add a simple benchmark suite (10 representative endpoints, measure p50/p95 response time) that runs in CI and fails on a >20% regression. Use `pytest-benchmark` or custom timing.
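A stdlib sketch of the custom-timing option. The harness shape, the stored-baseline comparison, and the 20% tolerance mirror the recommendation above but are assumptions, not an existing harness.

```python
import statistics
import time

def benchmark(fn, runs=50):
    """Time fn() `runs` times; report p50/p95 latency in milliseconds.

    A minimal stand-in for pytest-benchmark: CI would compare these
    numbers against a committed baseline per endpoint.
    """
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - t0) * 1000.0)
    samples.sort()
    return {"p50": statistics.median(samples),
            "p95": samples[int(0.95 * (len(samples) - 1))]}

def regressed(current, baseline, tolerance=0.20):
    """True if current p95 exceeds the stored baseline by more than 20%."""
    return current["p95"] > baseline["p95"] * (1 + tolerance)
```

Comparing p95 rather than mean is deliberate: tail latency is what a `helpers.py` change would most plausibly degrade across hundreds of endpoints.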
- **Missing production governance files.** No `SECURITY.md` (vulnerability disclosure process), no `CODE_OF_CONDUCT.md`, no `CONTRIBUTING.md` with an architecture guide, no SBOM (software bill of materials). For an open-source project with 408 modules and ML model downloads, these are expected by enterprise adopters.
  - Recommended action: Add `SECURITY.md` with a disclosure process and supported-versions table. Generate an SBOM from `pyproject.toml` deps.
- **FastAPI migration trigger has been reached.** The roadmap defers FastAPI migration until ">300 routes" with the rationale that validation boilerplate would become unmanageable. Current state: 1,088 routes, 879 mutation endpoints, and manual `safe_float()`/`safe_int()`/`safe_bool()` validation in every handler. Pydantic models would eliminate ~60% of per-route validation boilerplate and provide automatic request/response schema generation.
  - Recommended action: This remains low priority because Flask works and migration risk is high with 83 blueprints. However, the original deferral rationale no longer holds. If a major refactor is planned (e.g., helpers.py decomposition), consider migrating 1-2 blueprints to FastAPI as a proof-of-concept to measure the cost/benefit.
| Finding | Priority | Effort | Impact | Status |
|---|---|---|---|---|
| GPU process isolation (Wave 3A) | HIGH | XL | Eliminates OOM crashes | Not started |
| Rate limiting expansion | HIGH | M | Prevents DoS / resource exhaustion | 4% coverage |
| Test depth & coverage threshold | HIGH | L | Catches regressions before release | 50% threshold |
| Roadmap rebase to v1.14.0 | HIGH | S | Accurate planning | Stale since v1.9.26 |
| helpers.py decomposition | MEDIUM | M | Reduces coupling, merge conflicts | 350 imports |
| UXP full parity (Wave 3B) | MEDIUM | L | CEP EOL Sept 2026 | ~85% parity |
| Type checking in CI | MEDIUM | M | Catches type bugs statically | Not started |
| Subprocess cancellation | MEDIUM | M | Clean job cancel behavior | 158 untracked calls |
| Security scanning in CI | MEDIUM | S | Catches vulnerabilities | Not started |
| Temp file disk management | MEDIUM | S | Prevents disk exhaustion | No quota |
| Flaky test elimination | MEDIUM | S | Reliable CI | 25+ sleep-based tests |
| Auto-generated API docs | LOW | S | Enables plugin ecosystem | No spec exists |
| Performance benchmarks in CI | LOW | M | Detects regressions | Not started |
| Production governance files | LOW | S | Enterprise readiness | Missing |
| FastAPI migration evaluation | LOW | XL | Reduces boilerplate at scale | Deferred |