Skip to content

sgkul2000/cf-ai-dbagent

Repository files navigation

cf_ai_dbagent

A conversational AI assistant for PostgreSQL, deployed entirely on Cloudflare Workers. Ask questions in plain English and get back SQL, results, schema insights, performance analysis, and controlled write operations — all with no external AI API keys required.

Built for the Cloudflare 1111 Intern Program.


Features

  • Natural language to SQL — ask questions, get queries and results back in plain language
  • Schema exploration — list schemas, tables, columns, indexes, and relationships
  • Semantic schema search — "where is email stored?" finds the right tables via vector embeddings, even on large schemas
  • Query profiling — null rates, distinct counts, top values, and min/max/avg per column
  • Performance analysis — EXPLAIN ANALYZE output with index suggestions
  • Controlled writes — schema changes and data modifications are proposed first and only run after explicit user approval
  • Saved queries — name and reuse queries across sessions
  • Health checks — automated 24-hour database health monitoring
  • GitHub OAuth — optional authentication with allowlist support

Architecture

Two Cloudflare Workers communicate over HTTP using the Model Context Protocol (MCP):

Browser (React + useAgentChat)
        |  WebSocket
        v
cf-ai-dbagent  (Cloudflare Worker + Durable Object)
  DatabaseAgent
  - Workers AI: @cf/meta/llama-3.3-70b-instruct-fp8-fast
  - Vectorize: schema-embeddings (768d, cosine)
  - DO SQLite: query history, saved queries, schema cache
        |  HTTP (Streamable MCP)
        v
cf-ai-dbagent-mcp  (Cloudflare Worker + Durable Object)
  PgSchemaExplorerAgent
  - Hyperdrive -> PostgreSQL
  - 8 read-only tools + gated execute_write

Cloudflare services used

Service Purpose
Workers + Durable Objects Serverless compute and stateful sessions
Workers AI LLM inference — no external API keys needed
Vectorize Semantic schema search via embeddings
Hyperdrive PostgreSQL connection pooling at the edge
AI Gateway Request logging, caching, model fallback

Available tools

MCP tools (served by the MCP worker, called by the chat agent):

Tool Description
list_schemas List all user-defined schemas
explore_schema List tables in a schema with columns and relationships
describe_table Full details for a table — columns, indexes, foreign keys
find_relationships Traverse the FK graph
execute_query Run a SELECT statement and return results as JSON
explain_query EXPLAIN ANALYZE with interpretation
profile_table Null rates, distinct counts, top values per column
suggest_indexes Find missing indexes based on sequential scan stats
generate_sql Convert a natural language question to SQL
execute_write Propose a write operation — requires human approval before execution

Agent tools (run directly in the DatabaseAgent Durable Object):

Tool Description
search_schema Semantic search over embedded schema descriptions
save_query Save a named query to DO SQLite
list_saved_queries List all saved queries
run_saved_query Execute a saved query by name

Project structure

cf_ai_dbagent/
├── src/                    # Chat Agent (TypeScript, Cloudflare Workers)
│   ├── server.ts           # DatabaseAgent Durable Object, WebSocket routing, OAuth
│   ├── app.tsx             # React chat UI with SQL syntax highlighting
│   ├── client.tsx          # WebSocket client
│   ├── mcp/client.ts       # MCP client wrapper + write approval flow
│   ├── prompts/system.ts   # LLM system prompt
│   ├── tools/              # Agent-side tools (query management, schema search, health)
│   ├── vectorize/          # Schema embedding indexer
│   └── util/               # Zod schema sanitizer
├── mcp-server/             # MCP Server (JavaScript, Cloudflare Workers)
│   ├── src/
│   │   ├── worker.js       # Cloudflare Worker entry point
│   │   ├── mcp-server.js   # Tool registration
│   │   ├── db/             # PostgreSQL pool, introspection, safety checks
│   │   ├── tools/          # Schema, query, profiling, optimization tools
│   │   └── nlsql/          # Natural language to SQL
│   └── benchmark/          # Performance benchmarks
├── eval/                   # Evaluation harness
├── wrangler.jsonc          # Chat Agent Cloudflare config
├── vite.config.ts          # Frontend build config
├── tailwind.config.ts      # Tailwind CSS
└── PROMPTS.md              # Prompt engineering notes and iteration history

Setup

Prerequisites

  • Node.js 20+
  • Wrangler CLI: npm install -g wrangler
  • Cloudflare account with Workers paid plan (required for Durable Objects)
  • A PostgreSQL database (Neon, Supabase, RDS, self-hosted, etc.)

1. Deploy the MCP server

cd mcp-server
npm install

# Create a Hyperdrive config pointing at your Postgres instance
wrangler hyperdrive create cf-ai-dbagent-pg \
  --connection-string "postgres://user:pass@host:5432/dbname"

# Paste the output Hyperdrive ID into mcp-server/wrangler.jsonc
# Under [[hyperdrive]], set id = "<your-hyperdrive-id>"

# Deploy
wrangler deploy
# Note the deployed URL: https://cf-ai-dbagent-mcp.<subdomain>.workers.dev

To allow write operations (disabled by default), uncomment PG_WRITE_ENABLED = "true" in mcp-server/wrangler.jsonc before deploying.

2. Create the Vectorize index

wrangler vectorize create schema-embeddings --dimensions=768 --metric=cosine

3. Deploy the Chat Agent

cd ..  # back to repo root
npm install

# In wrangler.jsonc, update MCP_SERVER_URL with your MCP server's deployed URL:
# "MCP_SERVER_URL": "https://cf-ai-dbagent-mcp.<subdomain>.workers.dev/mcp"

npm run build   # builds the React frontend with Vite
wrangler deploy

Optional: GitHub OAuth

To require authentication before accessing the agent, set the following secrets:

wrangler secret put AUTH_SECRET          # any random secret string
wrangler secret put GITHUB_CLIENT_SECRET # from your GitHub OAuth app

And set in wrangler.jsonc:

"GITHUB_CLIENT_ID": "<your-client-id>"

To restrict access to specific GitHub accounts, also set:

wrangler secret put GITHUB_ALLOWED_LOGINS  # comma-separated GitHub usernames

Local development

Run both workers simultaneously in two terminals:

# Terminal 1 — MCP server (uses .dev.vars for Postgres credentials)
cd mcp-server && wrangler dev --port 8788

# Terminal 2 — Chat Agent
wrangler dev --port 8787

Open http://localhost:8787 in your browser. The chat agent defaults to http://localhost:8788/mcp when running locally.

Local Postgres credentials (MCP server)

Create mcp-server/.dev.vars with your local database connection:

PG_HOST=localhost
PG_PORT=5432
PG_USER=your_user
PG_PASSWORD=your_password
PG_DATABASE=your_database
PG_SCHEMA=public

Connecting a different Postgres database

Create a new Hyperdrive config and update mcp-server/wrangler.jsonc:

wrangler hyperdrive create cf-ai-dbagent-pg \
  --connection-string "postgres://user:password@host:5432/database"

Hyperdrive supports any Postgres-compatible database: Neon, Supabase, PlanetScale Postgres, Aurora, self-hosted, etc.


Evaluation

cd eval
npm install
npx tsx runner.ts
# Results written to eval/results/

Tech stack

Layer Technology
Frontend React 19, TypeScript, Vite, TailwindCSS
Chat Agent Cloudflare Workers, Durable Objects, TypeScript
LLM Vercel AI SDK (ai), Workers AI (Llama 3.3 70B)
Agent framework agents SDK
MCP server Cloudflare Workers, Durable Objects, JavaScript
Protocol Model Context Protocol (MCP, streamable HTTP)
Database driver pg (node-postgres) via Hyperdrive
Semantic search Cloudflare Vectorize (768d cosine)
Validation Zod

About

A conversational AI assistant for PostgreSQL databases, deployed entirely on Cloudflare Workers. Users interact with their database in plain English and the agent handles schema exploration, query generation, performance analysis and controlled write operations with no external AI API Keys.

Resources

Stars

Watchers

Forks

Contributors