🦞 OpenClaw Zoom Agent

An AI voice agent that joins Zoom meetings via PSTN dial-in. It listens to the call, transcribes it live, and speaks AI-generated responses.

How It Works

You → "Join my Zoom" → Agent dials Zoom PSTN → DTMF joins meeting
                      → Live transcription via Telnyx webhooks
                      → AI responses via GPT-4o-mini → Telnyx TTS

Quick Start

# 1. Install dependencies
npm install

# 2. Copy and fill in your credentials
cp .env.example .env

# 3. Join a meeting
node m3-voice-agent.js -m 83914076399 -p 953856

CLI Options

node m3-voice-agent.js [options]

  -m, --meeting-id <id>     Zoom meeting ID (required)
  -p, --passcode <code>     Meeting passcode
  -d <seconds>              Max duration (default: 600)
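
The flags above could be parsed along these lines. This is a minimal sketch, not the code in `m3-voice-agent.js`: it assumes Node 18.3 or newer (for `util.parseArgs`), and the function name `parseCli` and the long option name `duration` are hypothetical, since only the short form `-d` is documented.

```javascript
const { parseArgs } = require('node:util');

// Parse the documented flags into a plain options object.
// "duration" is a hypothetical long name; only -d is documented.
function parseCli(argv) {
  const { values } = parseArgs({
    args: argv,
    options: {
      'meeting-id': { type: 'string', short: 'm' },
      passcode: { type: 'string', short: 'p' },
      duration: { type: 'string', short: 'd' },
    },
  });

  // Zoom often displays IDs as "839 1407 6399", so drop spaces and dashes.
  const meetingId = (values['meeting-id'] || '').replace(/[\s-]/g, '');
  if (!/^\d+$/.test(meetingId)) {
    throw new Error('A numeric meeting ID is required (-m, --meeting-id)');
  }

  // Fall back to the documented default of 600 seconds.
  const rawDuration = values.duration ?? '600';
  if (!/^\d+$/.test(rawDuration) || Number(rawDuration) === 0) {
    throw new Error('-d must be a positive whole number of seconds');
  }

  return {
    meetingId,
    passcode: values.passcode ?? null,
    maxDurationSec: Number(rawDuration),
  };
}
```

Calling `parseCli(process.argv.slice(2))` with the Quick Start arguments would yield the meeting ID, the passcode, and the default duration of 600 seconds.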

Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| `TELNYX_API_KEY` | ✅ | Telnyx API key |
| `TELNYX_DID` | ✅ | Your Telnyx phone number |
| `TELNYX_CONNECTION_ID` | ✅ | Call control application ID |
| `OPENAI_API_KEY` | ✅ | OpenAI API key (for GPT responses) |
| `ZOOM_DIAL_IN` | | Zoom dial-in number (default: +16699009128) |
| `AGENT_NAME` | | Display name (default: "AI Assistant") |
| `AGENT_ROLE` | | Role description |
| `AGENT_INSTRUCTIONS` | | Custom system prompt |
| `BUFFER_DELAY` | | Ms to wait before responding (default: 1500) |
| `NO_SPEAK` | | Set to "true" for listen-only mode |
| `TRANSCRIPT_FILE` | | Path to save transcript after call |
| `USE_OPENCLAW_BRAIN` | | Set to "true" to route responses via OpenClaw gateway |
| `OPENCLAW_GATEWAY` | | OpenClaw gateway URL (default: http://localhost:18789) |
| `OPENCLAW_TOKEN` | | OpenClaw API token (if auth enabled) |
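
As an illustration of how these variables and their defaults fit together, the sketch below validates the four required variables and applies the documented defaults. The function name `loadConfig` and the returned property names are hypothetical; the agent itself may read `process.env` differently.

```javascript
// Variables marked as required in the table above.
const REQUIRED_VARS = [
  'TELNYX_API_KEY',
  'TELNYX_DID',
  'TELNYX_CONNECTION_ID',
  'OPENAI_API_KEY',
];

// Build a config object from an env-like map, failing fast on missing values.
function loadConfig(env = process.env) {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }

  const bufferDelayMs = Number(env.BUFFER_DELAY || 1500);
  if (!Number.isFinite(bufferDelayMs) || bufferDelayMs < 0) {
    throw new Error('BUFFER_DELAY must be a non-negative number of milliseconds');
  }

  return {
    telnyxApiKey: env.TELNYX_API_KEY,
    telnyxDid: env.TELNYX_DID,
    telnyxConnectionId: env.TELNYX_CONNECTION_ID,
    openaiApiKey: env.OPENAI_API_KEY,
    zoomDialIn: env.ZOOM_DIAL_IN || '+16699009128',
    agentName: env.AGENT_NAME || 'AI Assistant',
    agentRole: env.AGENT_ROLE || null,
    agentInstructions: env.AGENT_INSTRUCTIONS || null,
    bufferDelayMs,
    noSpeak: env.NO_SPEAK === 'true',
    transcriptFile: env.TRANSCRIPT_FILE || null,
    useOpenclawBrain: env.USE_OPENCLAW_BRAIN === 'true',
    openclawGateway: env.OPENCLAW_GATEWAY || 'http://localhost:18789',
    openclawToken: env.OPENCLAW_TOKEN || null,
  };
}
```

Failing at startup with the full list of missing names avoids placing a billable call only to discover a missing key partway through.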

Architecture

Files

| File | Purpose |
| --- | --- |
| `m3-voice-agent.js` | Main agent: full voice loop (transcribe + respond + speak) |
| `m2-live-transcribe.js` | Transcription-only mode (no AI responses) |
| `gemini-live-agent.js` | Gemini Live API integration (experimental; audio bridge WIP) |
| `bridge.js` | WebSocket bridge (requires public URL for media streaming) |
| `server.js` | Legacy Retell AI integration |

Flow

1. Dial: Telnyx PSTN call to Zoom dial-in number
2. Join: DTMF sequence: meeting ID → skip participant ID → passcode
3. Tunnel: ngrok exposes local webhook server for Telnyx events
4. Transcribe: Telnyx real-time transcription (Engine B)
5. Think: OpenClaw brain (if enabled) or GPT-4o-mini generates response
6. Speak: Telnyx TTS speaks response into the call
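
The Join step can be sketched as a function that assembles the digit string for Zoom's phone prompts. This is an illustration under assumptions, not the agent's actual sequence: the function name `buildJoinDigits` is hypothetical, and it assumes the telephony provider treats `w` as a half-second pause inside a DTMF string (Telnyx documents this for its send-DTMF command, but verify it for your setup). The pause count is a tunable guess.

```javascript
// Build the DTMF string: meeting ID, skip participant ID, then passcode.
// Assumption: "w" is a 0.5 s pause in the provider's DTMF digit syntax.
function buildJoinDigits(meetingId, passcode, pausesBetweenPrompts = 8) {
  const id = String(meetingId).replace(/[\s-]/g, '');
  if (!/^\d+$/.test(id)) {
    throw new Error('Meeting ID must be numeric');
  }

  const pause = 'w'.repeat(pausesBetweenPrompts);

  // "<id>#" answers the meeting ID prompt; a bare "#" skips the participant ID.
  let digits = `${id}#${pause}#`;

  if (passcode !== undefined && passcode !== null && passcode !== '') {
    const code = String(passcode);
    if (!/^\d+$/.test(code)) {
      throw new Error('Phone passcode must be numeric');
    }
    digits += `${pause}${code}#`;
  }
  return digits;
}
```

For example, `buildJoinDigits('83914076399', '953856', 2)` returns `83914076399#ww#ww953856#`. Longer pauses make the join slower but give the IVR time to finish each prompt before the next digits arrive.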

Timing

~15s for Zoom IVR greeting
~13s for DTMF sequence (meeting ID + passcode)
Total join time: ~30s (IVR greeting plus DTMF sequence)
~1.5s buffer before each response once joined (configurable via BUFFER_DELAY)
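
One plausible reading of the response buffer is a debounce: transcript fragments are collected, and the agent responds only after no new fragment has arrived for the configured delay. The README does not describe the implementation, so the sketch below is an assumption; `createUtteranceBuffer` and its parameters are hypothetical names. The timer functions are injectable so the logic can be exercised without waiting in real time.

```javascript
// Collect transcript fragments and flush them as one utterance after
// delayMs of silence. setTimer/clearTimer default to the global timers.
function createUtteranceBuffer({
  delayMs = 1500,
  onFlush,
  setTimer = setTimeout,
  clearTimer = clearTimeout,
}) {
  let fragments = [];
  let timer = null;

  function flush() {
    timer = null;
    const text = fragments.join(' ').trim();
    fragments = [];
    if (text) onFlush(text);
  }

  return {
    // Add a fragment and restart the silence timer.
    push(fragment) {
      const cleaned = String(fragment).trim();
      if (cleaned) fragments.push(cleaned);
      if (timer !== null) clearTimer(timer);
      timer = setTimer(flush, delayMs);
    },
  };
}
```

A shorter delay lowers response latency but raises the chance of answering a speaker mid-sentence; a longer one does the opposite.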

Requirements

Node.js 18+
ngrok installed and authenticated
Telnyx account with:
- A phone number (DID)
- Call control application (outbound channel limit ≥ 2)
OpenAI API key

Telnyx Setup

1. Create a Call Control Application
2. Set outbound channel limit to at least 2
3. Assign your phone number to the application
4. Note the connection ID; that's your TELNYX_CONNECTION_ID

The webhook URL is set dynamically at runtime via the API (no manual config needed).
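
The README does not say which API call sets the webhook. One approach consistent with the description is to pass a per-call `webhook_url` on the Telnyx Dial request (`POST /v2/calls`); the agent may instead update the application's webhook, so treat this as a sketch. The function name `buildDialRequest`, the ngrok hostname, and the `/webhook` path in the usage note are hypothetical.

```javascript
// Build (but do not send) a Telnyx Dial request that carries its own webhook URL.
function buildDialRequest({ apiKey, connectionId, from, to, webhookUrl }) {
  return {
    url: 'https://api.telnyx.com/v2/calls',
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        connection_id: connectionId, // TELNYX_CONNECTION_ID
        from,                        // TELNYX_DID
        to,                          // ZOOM_DIAL_IN
        webhook_url: webhookUrl,     // public tunnel URL for call events
      }),
    },
  };
}
```

With Node 18's built-in `fetch`, the request could be sent as `const { url, init } = buildDialRequest({ ..., webhookUrl: 'https://example.ngrok.app/webhook' }); const res = await fetch(url, init);`. Keeping construction separate from sending makes the payload easy to inspect before a real call is placed.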

Language Support

The agent responds in the same language the speaker uses:

English → English response
Chinese (Mandarin) → Chinese response
Mixed → dominant language

Telnyx TTS supports en-US and cmn-CN voices.
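
Choosing between the two voices for a mixed-language response could be done with a rough script count. This is a heuristic sketch, not the agent's method (the model prompt may handle language choice on its own); `pickVoiceLanguage` is a hypothetical name. It compares the number of Han characters with the number of Latin-letter words, treating each as roughly one unit of speech.

```javascript
// Return the TTS language tag for the dominant language in the text.
function pickVoiceLanguage(text) {
  // Each Han character counts as one unit.
  const hanCount = (text.match(/\p{Script=Han}/gu) || []).length;
  // Each run of Latin letters counts as one unit.
  const latinWordCount = (text.match(/[A-Za-z]+/g) || []).length;
  // Ties and empty input fall back to English.
  return hanCount > latinWordCount ? 'cmn-CN' : 'en-US';
}
```

For example, `pickVoiceLanguage('我们用 Zoom 开会吧')` returns `cmn-CN` because six Han characters outweigh one Latin word, while `pickVoiceLanguage('Please send the 报告 today')` returns `en-US`.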

Limitations

Voice quality: Telnyx basic TTS (robotic). Upgrade path: OpenAI TTS → audio streaming.
Latency: ~2-4s round-trip (transcription + GPT + TTS). Reducible with streaming.
One call per instance: Run multiple instances for concurrent meetings.
PSTN only: No Zoom SDK integration (yet), so audio is limited to phone quality.

License

MIT
