
# Adding a Custom Data Source

vox works with any data source Claude Code can access. Here's how to add your own.

## Option 1: Build an MCP Server

If your data lives in an API (your own product, a CRM, a custom database), you can build an MCP server that exposes it to Claude Code.

### MCP Server Basics

An MCP server exposes "tools" that Claude Code can call. For VoC research, useful tools include:

```
list_conversations    — Returns conversation records with metadata
get_transcript        — Returns full transcript for a conversation ID
search_conversations  — Searches across conversations by keyword/filter
get_customer          — Returns customer/account metadata
list_tickets          — Returns support tickets with filters
```
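As an illustration, the records these tools return might be shaped like this. The field names below are hypothetical, not a published vox schema — adapt them to whatever your source actually stores:

```typescript
// Hypothetical record shapes for a VoC data source.
// Field names are illustrative, not a required vox format.
interface ConversationSummary {
  id: string;
  date: string;     // ISO 8601
  segment: string;  // e.g. "enterprise", "smb"
  type: string;     // e.g. "support", "sales-call"
}

interface TranscriptTurn {
  speaker: "customer" | "agent"; // preserve speaker attribution
  text: string;                  // full raw text, not a summary
}

interface Transcript {
  conversationId: string;
  turns: TranscriptTurn[];
}

// A minimal in-memory stand-in for what list_conversations /
// get_transcript could return.
const conversations: ConversationSummary[] = [
  { id: "c-1", date: "2025-01-15", segment: "enterprise", type: "support" },
];

const transcripts: Record<string, Transcript> = {
  "c-1": {
    conversationId: "c-1",
    turns: [
      { speaker: "customer", text: "The export keeps timing out on big files." },
      { speaker: "agent", text: "Thanks, let me look into that." },
    ],
  },
};

function getTranscript(id: string): Transcript | undefined {
  return transcripts[id];
}
```

Keeping the per-turn `speaker` field and the raw `text` is what lets downstream analysis attribute quotes correctly.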

### Building with the MCP SDK

```shell
# TypeScript
npx @anthropic/create-mcp-server my-data-source

# Python
pip install mcp
```

See the MCP documentation for full guides.

### Key Design Principles for VoC Data Sources

  1. Return full text, not summaries. vox needs the raw language to do proper analysis. Summaries lose the emotional signal.
  2. Include metadata. Date, customer segment, customer size, plan tier, conversation type — all of these enable segmented analysis.
  3. Support filtering. Let the agent filter by date range, segment, topic, or conversation type. Without filtering, every query hits the entire dataset.
  4. Paginate large results. If you have thousands of records, return pages of 50-100 with a continuation token.
  5. Preserve speaker attribution. For transcripts, identify who said what. "Customer said X" is much more useful than "someone said X."
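Principle 4 can be sketched as a simple token-based pager. The token format here is an assumption for illustration (just the next offset encoded as a string); production servers often use opaque cursors instead:

```typescript
// Minimal continuation-token pagination over an in-memory record set.
interface Page<T> {
  items: T[];
  nextToken?: string; // absent on the last page
}

function listPage<T>(records: T[], pageSize: number, token?: string): Page<T> {
  const offset = token ? parseInt(token, 10) : 0;
  const items = records.slice(offset, offset + pageSize);
  const next = offset + pageSize;
  // Only emit a token when more records remain.
  return next < records.length ? { items, nextToken: String(next) } : { items };
}
```

A caller simply loops, passing each `nextToken` back in, until the token is absent.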

### Registering Your MCP Server

Add to `.claude/settings.json`:

```json
{
  "mcpServers": {
    "my-data-source": {
      "command": "node",
      "args": ["path/to/your/mcp-server/index.js"],
      "env": {
        "API_KEY": "your-key"
      }
    }
  }
}
```

## Option 2: File Export Pipeline

If building an MCP server is overkill, set up a script that exports data to JSON/CSV files.

```shell
# Example: Export from your database to JSON
./export-conversations.sh --since "2025-01-01" --output data/conversations.json

# Example: Pull from an API
curl -H "Authorization: Bearer $TOKEN" \
  "https://api.yourproduct.com/conversations?since=2025-01-01" \
  > data/conversations.json
```
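Whichever source you export from, it helps to normalize the records into one shape before analysis and drop rows that lack full text. A sketch — the field names (`body`, `text`, `created_at`) are assumptions about your export, not a required format:

```typescript
// Normalize heterogeneous export records into one shape, dropping
// records that lack the full text vox needs for analysis.
interface RawRecord {
  id?: string;
  body?: string;       // some exports use "body"...
  text?: string;       // ...others use "text"
  created_at?: string;
}

interface CleanRecord {
  id: string;
  text: string;
  date: string;
}

function normalize(raw: RawRecord[]): CleanRecord[] {
  return raw.flatMap((r) => {
    const text = r.text ?? r.body;                   // tolerate either field name
    if (!r.id || !text || !r.created_at) return [];  // skip incomplete rows
    return [{ id: r.id, text, date: r.created_at }];
  });
}
```

Running a step like this inside the export script keeps bad rows from silently skewing segmented analysis later.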

Schedule this to run regularly (cron, GitHub Actions, etc.) so your data stays fresh.

## Option 3: Direct API Access via WebFetch

If your data source has a REST API, vox can call it directly using WebFetch. No MCP server needed, but you'll need to provide the API details in your research query:

"Analyze customer feedback from our API at https://api.example.com/feedback — use header Authorization: Bearer TOKEN"

This works for one-off analysis, but it isn't suited to repeated use — the API details (including the token) live in your query text rather than in configuration. For ongoing research, build an MCP server or a file export pipeline instead.

## Testing Your Integration

After connecting a new data source:

  1. Run `/voc test connection` — vox will attempt to discover and list available data
  2. Check that it can retrieve full conversation text (not just metadata)
  3. Check that metadata (segment, date, etc.) comes through correctly
  4. Run a small analysis to verify the data quality is sufficient
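Steps 2 and 3 can be automated with a small smoke check over a sample of records. The thresholds and field names below are arbitrary illustrative choices, not vox requirements:

```typescript
// Quick data-quality smoke check: verify records carry full text and
// the metadata fields that segmented analysis depends on.
interface SampleRecord {
  id: string;
  text: string;
  date?: string;
  segment?: string;
}

function smokeCheck(sample: SampleRecord[]): string[] {
  const problems: string[] = [];
  for (const r of sample) {
    // Very short text usually means a summary or a truncated export.
    if (r.text.length < 40) problems.push(`${r.id}: text looks truncated`);
    if (!r.date) problems.push(`${r.id}: missing date`);
    if (!r.segment) problems.push(`${r.id}: missing segment`);
  }
  return problems;
}
```

An empty result means the sample looks analyzable; otherwise each entry names the record and the gap to fix upstream.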

## Adding a Custom Analysis Framework

Beyond data sources, you can also add custom analysis frameworks:

  1. Create a new markdown file in `/frameworks/your-framework.md`
  2. Follow the same structure as existing frameworks:
     - What the framework is
     - When to use it
     - Step-by-step methodology
     - Common pitfalls
     - Connection to other frameworks
  3. Create a matching command in `/.claude/commands/your-command.md`
  4. Create a matching template in `/templates/your-template.md`

vox will automatically discover and use new frameworks when they're relevant to a research question.
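A new framework file can follow the five-part outline above. An illustrative skeleton — the section wording is a suggestion, not a format vox enforces:

```markdown
# Your Framework Name

## What it is
One paragraph describing the framework.

## When to use it
The research questions this framework fits, and when another framework fits better.

## Methodology
1. Step one
2. Step two

## Common pitfalls
- A pitfall and how to avoid it

## Connection to other frameworks
How results feed into or draw from the other files in /frameworks/.
```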