Open-source, agent-native CLI for AI video. Fork-able, observable, reproducible. Drive it from Claude Code / Cursor / Codex. Get an mp4 in ~8 minutes.
Two API keys (OPENROUTER_API_KEY + ELEVENLABS_API_KEY), one CLI, a video pipeline you can ship to prod. Ralphy wires up image / video / vision / LLM (OpenRouter), voice + music (ElevenLabs), HTML+GSAP composition (HyperFrames), and a local async-job queue (bun + SQLite). Your agent drives it.
See what Ralphy makes: 11 rendered outputs from real projects.
Cost: ~$8–12 per 30s video. Speed: ~8 min cold-start, ~25 min for a 10-batch. Engine: HyperFrames (HTML + GSAP, deterministic Puppeteer + FFmpeg render).
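As a back-of-the-envelope check on those headline numbers, the sketch below only rescales the quoted figures (~$8–12 per 30 s video, ~25 min for a batch of 10). It assumes cost scales linearly with video count, which is not stated here, so treat the results as rough estimates rather than a quote.

```python
# Rough arithmetic on the quoted figures; linear scaling is an assumption.
COST_PER_VIDEO_USD = (8.0, 12.0)  # quoted range for one 30 s video
VIDEO_SECONDS = 30
BATCH_SIZE = 10
BATCH_MINUTES = 25.0              # quoted wall-clock time for a batch of 10


def cost_per_second(cost_range=COST_PER_VIDEO_USD, seconds=VIDEO_SECONDS):
    """Dollar cost per second of finished video, as a (low, high) pair."""
    low, high = cost_range
    return (round(low / seconds, 2), round(high / seconds, 2))


def batch_cost(n=BATCH_SIZE, cost_range=COST_PER_VIDEO_USD):
    """Estimated spend for n videos, as a (low, high) pair."""
    low, high = cost_range
    return (low * n, high * n)


def minutes_per_video_in_batch(n=BATCH_SIZE, total_minutes=BATCH_MINUTES):
    """Amortised wall-clock minutes per video inside a batch."""
    return total_minutes / n
```

Under that assumption a batch of 10 lands somewhere around $80–120, and batching amortises the ~8 min cold start down to about 2.5 min per video.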
| Platform | Command |
|---|---|
| macOS (Homebrew) | `brew install alecs5am/tap/ralphy` |
| Linux / macOS (curl) | `curl -fsSL https://raw.githubusercontent.com/alecs5am/ralphy/main/install.sh \| sh` |
| Windows (PowerShell) | `irm https://raw.githubusercontent.com/alecs5am/ralphy/main/install.ps1 \| iex` |
| Cross-platform (npm) | `npm install -g @alecs5am/ralphy` |
All four ship the same binary.
ralphy setup # interactive wizard — paste the two API keys + install agent skill
ralphy doctor # verify env is green

Expected output:
✦ ralphy v1.0.0
▸ Dependencies ✓ bun ✓ ffmpeg
▸ API keys ✓ OPENROUTER_API_KEY ✓ ELEVENLABS_API_KEY
✓ ready
macOS Gatekeeper warning? You used the direct-download path. Brew / npm / `install.sh` bypass Gatekeeper automatically. If you hit it: run `xattr -d com.apple.quarantine /path/to/ralphy` once and you're done.

Verify your install: every Release includes a `SHA256SUMS` file. `shasum -a 256 -c SHA256SUMS` (macOS / Linux) or `Get-FileHash` (Windows) confirms the binary matches.
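If neither tool is handy, the same check can be sketched in a few lines of Python. This is a minimal illustration, assuming `SHA256SUMS` uses the standard `shasum` layout (`<hex digest>  <filename>` per line) and that the binaries sit next to it; the file names in the usage note are placeholders.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(sums_file: Path) -> dict[str, bool]:
    """Check every '<hex>  <name>' entry in a SHA256SUMS-style file."""
    results = {}
    for line in sums_file.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # '*' marks binary mode in shasum output
        target = sums_file.parent / name
        # A missing file counts as a failed check rather than an error.
        results[name] = target.is_file() and sha256_of(target) == expected.lower()
    return results
```

Calling `verify(Path("SHA256SUMS"))` returns a mapping from file name to `True` or `False`; any `False` means the download should not be trusted.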
# 1. Create a project
ralphy new "Spring espresso ad" --id espresso-001
# 2. Find a template by free-text utterance
ralphy template suggest "talking head rant about deadlines" -p
✦ Query: "talking head rant about deadlines"
1. ✓ talking-head-rant ███████████░░░░░ 0.70 strong
2. ⚠ storytime ████████░░░░░░░░ 0.50 weak
3. ⚠ yap-talking-head ████████░░░░░░░░ 0.50 weak
# 3. Scaffold from the chosen template (assets auto-pull from companion repo)
ralphy template use talking-head-rant --id espresso-001
# 4. Cost-preview before spending a cent
ralphy generate image --project espresso-001 --slot scene-01-bg \
--prompt "studio packshot, white seamless, 50mm, photoreal" --dry-run
# 5. Render the project to mp4
ralphy render espresso-001

That's it. Full CLI surface in docs/cli-surface.md and at ralphy.dev/docs.
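For readers wondering how to read the bars in the `template suggest` output from step 2: they look like a match score drawn on a 16-cell scale (0.70 fills 11 cells, 0.50 fills 8). The sketch below reproduces that rendering. It is an illustration only, not Ralphy's code, and the 0.6 cutoff for `strong` is a guess that merely separates the sample rows.

```python
def score_bar(score: float, width: int = 16) -> str:
    """Draw a score in [0, 1] as filled and empty cells."""
    clamped = max(0.0, min(1.0, score))
    filled = round(clamped * width)
    return "█" * filled + "░" * (width - filled)


def match_label(score: float, strong_at: float = 0.6) -> str:
    """Classify a match; the cutoff is hypothetical, not from the CLI."""
    return "strong" if score >= strong_at else "weak"


def format_row(rank: int, slug: str, score: float) -> str:
    """Format one result row in roughly the layout shown in step 2."""
    mark = "✓" if match_label(score) == "strong" else "⚠"
    return f"{rank}. {mark} {slug} {score_bar(score)} {score:.2f} {match_label(score)}"
```

`format_row(1, "talking-head-rant", 0.70)` yields a row matching the first result shown in step 2.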
What you actually get vs other ways to do this:
| | Closed SaaS (Higgsfield, HeyGen, Captions) | Other OSS (ShortGPT, MoneyPrinterTurbo) | Ralphy |
|---|---|---|---|
| Source | Closed | OSS (script-shaped) | Apache 2.0, fork-able |
| Agent surface | Their cloud agent | None | Local skills + playbooks; works in any agent |
| Models | Vendor lock-in | One model, hardcoded | Any OpenRouter model — Kling / Seedance / Veo / Sora / Nano-Banana |
| Cost transparency | Subscription black box | Free-but-you-DIY | --dry-run shows the bill before you spend |
| Reproducibility | Vibes | Vibes | Append-only genlogs + postmortems + templates-as-git |
| Quality gates | Best-effort | None | Refuse-not-warn: bad scene = no render |
| Reference grounding | None | None | Built-in research engine (ralphy research) + guideline library |
| Composer | Web canvas (theirs) | MoviePy / FFmpeg scripts | HyperFrames (HTML + GSAP) — versioned in git, tested in CI |
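The "refuse-not-warn" row is the one that most changes behaviour, so here is a minimal sketch of the idea: a single failing scene aborts the whole render instead of printing a warning and carrying on. The `Scene` shape, the `problems` field, and the exception name are all hypothetical; the table does not describe how Ralphy's gates are implemented.

```python
from dataclasses import dataclass, field


class RenderRefused(Exception):
    """Raised when any scene fails a gate; no partial render is produced."""


@dataclass
class Scene:
    slot: str
    problems: list[str] = field(default_factory=list)  # hypothetical gate findings


def render(scenes: list[Scene]) -> str:
    """Render only if every scene is clean; otherwise refuse outright."""
    failing = {s.slot: s.problems for s in scenes if s.problems}
    if failing:
        # Refuse rather than warn: abort before any rendering work starts.
        raise RenderRefused(f"refusing to render, failing scenes: {failing}")
    return f"rendered {len(scenes)} scenes"
```

The design trade-off is deliberate: a hard failure costs a retry, while a warning that scrolls past costs a paid render of a video nobody can use.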
The hard rule that makes the rest work: ralphy <verb> is the only entry-point. No ad-hoc ffmpeg shell-outs, no direct provider fetches, no orphan scripts. Every model call lands in generations.jsonl, every cost in the rollup, every failure in the postmortem.
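To show why a single entry-point makes the bookkeeping cheap, here is a rough sketch of an append-only JSONL log with a cost rollup computed from it. Only the file name `generations.jsonl` comes from the text above; the record fields (`provider`, `model`, `cost_usd`) and both function names are assumptions made for illustration.

```python
import json
from collections import defaultdict
from pathlib import Path


def log_generation(log_path: Path, record: dict) -> None:
    """Append one model call as a single JSON line; old lines are never rewritten."""
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")


def cost_rollup(log_path: Path) -> dict[str, float]:
    """Sum the (hypothetical) cost_usd field per provider across the whole log."""
    totals: dict[str, float] = defaultdict(float)
    with log_path.open(encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                rec = json.loads(line)
                totals[rec["provider"]] += rec["cost_usd"]
    return {provider: round(total, 4) for provider, total in totals.items()}
```

Because the rollup is derived purely from the log, it can be recomputed at any time, and a call that bypasses the CLI simply never appears in either.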
graph LR
A[Agent: Claude Code / Cursor / Codex] -->|playbooks| B[ralphy CLI]
B --> C[Provider router]
C --> D[OpenRouter<br/>Kling / Seedance / Veo / Sora / Nano-Banana]
C --> E[ElevenLabs<br/>TTS + Music]
B --> F[HyperFrames composer<br/>HTML + GSAP]
F --> G[mp4 via Puppeteer + FFmpeg]
B --> H[Project memory<br/>genlogs · postmortems · cost rollup]
B --> I[Templates + guidelines<br/>versioned in git]
Five agent roles (researcher / scenarist / art-director / editor / producer) are routed via AGENTS.md. The router decides which playbook the agent reads before acting.
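The routing step can be pictured as a lookup from role to playbook. The sketch below is an illustration under assumptions: the role names and the `docs/playbooks/` directory appear in this README, but the one-file-per-role naming (`<role>.md`) and the function itself are hypothetical, and the real routing rules live in AGENTS.md.

```python
ROLES = ("researcher", "scenarist", "art-director", "editor", "producer")


def playbook_for(role: str, root: str = "docs/playbooks") -> str:
    """Return the playbook an agent should read before acting in a role."""
    if role not in ROLES:
        raise ValueError(f"unknown role {role!r}; expected one of {ROLES}")
    # One markdown file per role is an assumed layout, not a documented one.
    return f"{root}/{role}.md"
```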
| Surface | Read when |
|---|---|
| Mintlify docs | Quickstart, concepts, cookbook, CLI reference (auto-gen). |
| Models page | Live model picks, prices, known pitfalls (Kling rotation bias, Seedance privacy filter, gpt-image concurrent cap=1, …). |
| `AGENTS.md` | First. Routing rules + the "read the playbook before acting" discipline. |
| `MODELS.md` | Before every model call. Claude's training is stale on model names. |
| `docs/playbooks/` | Per-role instructions (researcher, scenarist, art-director, editor, producer). |
| `templates/CATEGORIES.md` | 50+ vibe-references + vibe-style templates, by category. |
| GitHub Discussions | Q&A, Show & Tell, Tester feedback. |
git clone https://github.com/alecs5am/ralphy.git
cd ralphy && bun install
bun test # unit + integration (450+ tests)
bun run lint # typecheck + project lints (errors / help-examples / skills / agents-md / templates / cli-surface)
bun run docs:cli # regenerate docs-mintlify/reference/cli/
bun run build:bin # build cross-platform binaries

A pre-commit hook runs the test suite. CI runs the same on push/PR.
PRs welcome — especially:
- New templates under `templates/<category>/<slug>/` (5 categories: b2b-saas, dtc-commerce, creator-lifestyle, entertainment-viral, cinematic-narrative).
- New model entries in `MODELS.md` with real cost numbers + known pitfalls.
- Bug fixes in `cli/lib/providers/`.
- New guidelines under `guidelines/<slug>/` (image-prompt rules, tag-able from chat as `@guideline:<slug>`).
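Before opening a template PR, it can help to sanity-check that the new directory follows the `templates/<category>/<slug>/` layout and uses one of the five listed categories. The helper below is a hypothetical pre-flight check, not part of the repo's lint suite.

```python
from pathlib import PurePosixPath

CATEGORIES = {
    "b2b-saas",
    "dtc-commerce",
    "creator-lifestyle",
    "entertainment-viral",
    "cinematic-narrative",
}


def parse_template_path(path: str) -> tuple[str, str]:
    """Split 'templates/<category>/<slug>/' into (category, slug) or raise."""
    parts = PurePosixPath(path).parts  # a trailing slash is dropped here
    if len(parts) != 3 or parts[0] != "templates":
        raise ValueError(f"expected templates/<category>/<slug>/, got {path!r}")
    _, category, slug = parts
    if category not in CATEGORIES:
        raise ValueError(f"unknown category {category!r}")
    return category, slug
```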
For non-trivial changes, open an issue first or start a discussion.
Apache 2.0. Use, fork, ship to prod — patent grant included.
Built with Claude Code, Bun, HyperFrames, OpenRouter, and ElevenLabs.