CLI and TUI for tracking AI token usage and cost — and for understanding how you actually work. Reads session transcripts directly from disk — no API keys, no proxy, no wrapper. Supports Claude Code and Codex CLI.
pipx install tokenscapeZero-install run:
uvx tokenscapeLocal install from source:
pip install -e .
pip install -e ".[semantic]" # with semantic clusteringOptional semantic clustering (PyPI):
pip install "tokenscape[semantic]"
# adds: fastembed (~150MB), scikit-learn, numpy
# downloads on first use: BAAI/bge-small-en-v1.5 (~130MB, cached in ~/.cache/fastembed)By default, tokenscape reads Claude Code sessions. Pass --tool codex to analyze Codex CLI sessions instead. The flag applies to all commands and the TUI.
tokenscape --tool codex today
tokenscape --tool codex report -p 30days
tokenscape --tool codex full-report --html-output report.html
tokenscape --tool codex # TUI in Codex mode| Tool | Session path |
|---|---|
claude (default) |
~/.claude/projects/ + ~/Library/Application Support/Claude/local-agent-mode-sessions/ |
codex |
~/.codex/sessions/ |
tokenscape # interactive TUI dashboard (default 7-day window)
tokenscape today # today's tokens
tokenscape month # this month's tokens
tokenscape report -p 30days
tokenscape report --from 2026-05-01 --to 2026-05-29
tokenscape report --refresh 60 # auto-refresh every 60s
tokenscape status # one-liner: today + month
tokenscape status --format json
tokenscape project # drill-down by project (interactive)
tokenscape project myproject # drill-down for a specific project
tokenscape export # CSV export (today / 7d / 30d)
tokenscape export -f json
tokenscape bundle # create a zip of session data to share with teammates
tokenscape full-report # full markdown report to stdout (all commands in one pass)
tokenscape full-report --source teammate-bundle-20260531.zip
tokenscape full-report --top 10 --output report.md
tokenscape full-report --html-output report.html # interactive HTML report
tokenscape full-report --output report.md --html-output report.html # both at once
tokenscape full-report --summarize --output report.md # append AI Insights section via LLM
tokenscape full-report --summarize --force-new # bypass cached summary
tokenscape patterns # shell automation candidates + hottest files + user prompt patterns
tokenscape workflow # activity transition sequences + session ramp time
tokenscape growth # per-project efficiency gaps
tokenscape models # model usage breakdown by activity + efficiency signals
tokenscape semantic # intent clustering of user prompts (requires [semantic] extra)Launch with tokenscape (no arguments).
Today [ 7 Days ] 30 Days Month
tokenscape [CLAUDE] 7 Days
92.4M total 1,330 turns 96.1% cache hit
5.9k in 896.6k out 89.3M cached 2.3M written
┌─ Daily Activity ──┐ ┌─ By Project ──────┐
│ ████▁▁ 2026-05-23 │ │ ████ tokenscape │
│ ███▁▁▁ 2026-05-24 │ │ ██▁▁ orchid │
│ ... │ │ ... │
└───────────────────┘ └───────────────────┘
┌─ By Activity ─────┐ ┌─ By Model ────────┐
│ Coding │ │ Sonnet 4.6 │
│ Exploration │ │ Haiku 4.5 │
│ ... │ │ ... │
└───────────────────┘ └───────────────────┘
┌─ Workflow Transitions ────────────────────┐ ┌─ Growth Signals ──────────────────────────┐
│ ████ Coding → Debugging 47 18% │ │ ! tokenscape debug/test ratio 4.2× ... │
│ ███ Debugging → Coding 38 15% │ │ ! orchid conversation 62% ... │
│ ... │ │ │
│ ramp mean 3.1 p90 11 (42 sessions) │ │ │
└──────────────────────────────────────────┘ └───────────────────────────────────────────┘
Keyboard shortcuts:
| Key | Action |
|---|---|
← / → |
Cycle periods |
1 |
Today |
2 |
7 Days |
3 |
30 Days |
4 |
Month |
t |
Toggle tool (CLAUDE ↔ CODEX) |
r |
Refresh |
q |
Quit |
The active tool is shown in the header as [CLAUDE] (blue) or [CODEX] (green). Switching with t reloads all panels immediately.
tokenscape projectLists recent projects sorted by last active date. Enter a number to select or type a search term to filter. Displays token usage broken down by day, activity, tool, shell command, and MCP server for the selected project.
tokenscape project orchid -p 7days
tokenscape project myapp --from 2026-05-01 --to 2026-05-15Beyond token cost, these commands answer: what do you repeatedly ask Claude to do, where do you spend the most time, and what does that reveal about your workflow?
| Command | Question answered | Time to run |
|---|---|---|
patterns |
What repeats at the mechanical level? | < 2s |
workflow |
How do sessions actually flow? | < 2s |
growth |
Which projects have process gaps? | < 2s |
models |
Which models do what, and is that efficient? | < 2s |
semantic |
What are my real recurring intents? | 5–30s first run, < 2s cached |
full-report |
All of the above as a single markdown document | < 5s (+ semantic if installed) |
full-report --html-output |
All of the above as an interactive HTML report | < 5s (+ semantic if installed) |
full-report --summarize |
All of the above + AI-generated insights section | depends on LLM |
A useful sequence: run semantic to find your top intent clusters, cross-reference with patterns to see the mechanical steps that accompany them, check growth for coverage or documentation gaps in those projects, and use workflow to see where in a session those patterns tend to occur.
What it answers: What does Claude do for me repeatedly, and what do I keep asking for?
tokenscape patterns # 30-day default
tokenscape patterns -p 7days
tokenscape patterns --min 5 # raise repetition threshold (default 3)Claude's top Bash operations — Every time Claude runs a shell command on your behalf, it's recorded. patterns counts these, normalized to the first two words, and surfaces the ones that repeat most. A command Claude runs 20+ times is a candidate for an alias, a Makefile target, or a CLAUDE.md shortcut so Claude stops reinventing it every session. High counts signal that Claude is doing the same mechanical step repeatedly — a sign the step could be encoded as a convention rather than re-derived each time.
Hottest files — Files Claude edits most frequently across sessions. A file appearing 15 times in 30 days is either a core module (expected) or a chronic trouble spot. Cross-reference with growth: if the same file appears in both hottest-files and your debug-heavy projects, it may warrant refactoring or better test coverage.
User prompt verbs, bigrams, and repeated prompts — These sections analyze what you typed, not what Claude ran. The leading verb of each prompt shows your dominant intent: add, fix, update, check. Bigrams surface recurring topic pairs after stopwords are removed. Exact repeated prompts are the highest-value automation targets — if you typed the same thing four times, it belongs in a slash command or CLAUDE.md workflow.
What it answers: How does my work actually flow within a session, and how long does it take me to get to productive work?
tokenscape workflow
tokenscape workflow -p 30daysActivity transitions — Each turn is classified into one of 13 activity types (Coding, Debugging, Exploration, Planning, etc.). workflow counts transitions between activities across all sessions. A high Exploration → Coding rate means you typically read before you write. A high Coding → Debugging rate means edits often need follow-up correction. These aren't judgements — they're a map of your actual process, which is the first step to changing it intentionally. Self-transitions are excluded; only cross-activity moves are shown.
Session ramp time — For each session, this counts how many turns elapsed before the first file edit. The mean and p90 tell you how much discovery and conversation overhead precedes actual changes. A high mean (10+ turns) may indicate that context being re-derived from scratch each session could instead be pre-loaded via CLAUDE.md or project documentation.
What it answers: Where are the gaps in my process, by project?
tokenscape growth
tokenscape growth -p 30daysDebug-to-test ratio — Compares turns classified as Debugging against turns classified as Testing within each project. A ratio above 3×, or zero test turns alongside repeated debugging, flags a project where bugs are being found reactively rather than caught proactively. This doesn't tell you how to add tests — it tells you which project most needs them.
Conversation ratio — The fraction of turns that are pure conversation (no tools used). High conversation is normal for design and review, but a project staying above 40% across many sessions often means unclear requirements regenerating the same discussion, missing documentation being reconstructed repeatedly, or architectural uncertainty that hasn't been resolved. Projects with fewer than five turns in the period are excluded to avoid noise.
What it answers: Which models are being used for what, and is that a good match?
tokenscape models
tokenscape models -p 7days
tokenscape models --by-project # add per-project model breakdownModel × activity breakdown — For each model in the period, shows total turns, total tokens, average tokens per turn, and the top three activity categories by share of turns. A model averaging 2k tokens/turn on Conversation is a different story than one averaging 80k tokens/turn on Feature Dev.
Efficiency signals — Flags any model where more than 30% of its turns fell into low-value activity categories (Conversation, Git Ops, General, Delegation). This isn't always actionable (Claude Code picks the model, not you), but it surfaces patterns worth knowing — e.g. Opus spending most of its turns on pure conversation turns that Sonnet handles equally well.
By project (--by-project) — Adds a second table showing model usage broken down per project: which model each project used, how many turns, total tokens, and top two activities. Useful for spotting a project that consistently pulls in a more expensive model than others.
What it answers: What are the 6–10 recurring things I actually ask Claude to do?
Requires pip install "tokenscape[semantic]".
tokenscape semantic # 90-day default, k auto-selected
tokenscape semantic -p 90days # longer window = better clusters
tokenscape semantic -k 10 # override cluster count
tokenscape semantic --project orchid
tokenscape semantic --labels # generate 2-3 word labels via LLM (see below)patterns can tell you that you typed add 12 times and fix 8 times, but add a search endpoint, add rate limiting, and add pagination are three instances of the same intent — pure counting sees them as unrelated. semantic embeds every prompt you typed using a local neural model and groups them by meaning, not wording.
Each prompt is embedded with fastembed using BAAI/bge-small-en-v1.5 (33M parameters, ~130MB, fully offline). Embeddings are cached in ~/.cache/tokenscape/embeddings.npz so re-runs are fast. Clusters are computed with k-means; k is auto-selected as sqrt(n/2) capped at 20. Each cluster is represented by its three nearest-to-centroid real prompts — you see your own words, not a generated label.
Prompts under three words are excluded (they're confirmations, not intent). A large cluster (20%+ of prompts) mapping to a single task type is an automation candidate: a slash command, a CLAUDE.md workflow entry, or a custom tool. A cluster of documentation prompts that follows feature work suggests a step that could be triggered automatically.
LLM cluster labels (--labels) — Pass --labels to generate a 2–3 word label per cluster instead of inferring the theme from examples. Requires a config file at ~/.config/tokenscape/config.toml (respects XDG_CONFIG_HOME). Supports any OpenAI-compatible endpoint:
[provider]
base_url = "http://localhost:11434/v1" # Ollama (non-thinking models only)
api_key = "ollama"
model = "llama3.2:latest"
# llama.cpp (recommended for thinking models — qwen3, deepseek-r1, etc.):
# base_url = "http://your-server/v1"
# api_key = "none"
# model = "Qwen3.6-35B-A3B-MXFP4_MOE.gguf"
# Anthropic:
# base_url = "https://api.anthropic.com/v1"
# api_key = "sk-ant-..."
# model = "claude-haiku-4-5-20251001"
# LM Studio / OpenRouter — any /v1/chat/completions endpointLabels are cached in ~/.cache/tokenscape/labels.json keyed by cluster content, so repeat runs with stable clusters make no API calls. If the config is absent or the API call fails, --labels silently falls back to showing example prompts.
What it answers: Everything, in one document.
tokenscape full-report # 30-day default, stdout
tokenscape full-report -p 7days # shorter window
tokenscape full-report --top 12 # more rows per table (default 8)
tokenscape full-report --labels # LLM cluster labels (requires config)
tokenscape full-report --output report.md # write markdown to file
tokenscape full-report --html-output report.html # write interactive HTML report
tokenscape full-report --output report.md --html-output report.html # both at once
tokenscape full-report --summarize # append AI Insights section (requires config)
tokenscape full-report --summarize --force-new # bypass cached summaryRuns all analysis in a single pass. Emits a markdown document (or interactive HTML report, or both) covering: summary, projects, workflow transitions + session ramp, growth signals, model efficiency + by-project model breakdown, patterns (shell commands, hottest files, prompt verbs, bigrams), and intent clusters if [semantic] is installed. Status/warning messages go to stderr so tokenscape full-report > report.md works cleanly.
HTML report (--html-output) — Generates a self-contained interactive HTML file alongside (or instead of) the markdown report. Includes all the same sections rendered with sortable tables, stacked activity-bar charts, and Chart.js bar charts for project token/turn breakdowns and model activity mix. Features a collapsible sidebar with scroll-aware navigation and a dark/light mode toggle. No server required — open the file directly in any browser.
AI Insights (--summarize) — Appends a ## AI Insights section written by an LLM, covering four areas: usage patterns, token efficiency, model selection, and recommended actions. The LLM receives a structured JSON summary of the report data (no raw prompts) and responds with a 200–300 word analysis referencing your actual numbers. Requires the same [provider] config as --labels. Results are cached in ~/.cache/tokenscape/summaries.json keyed by model + report data — re-runs with the same data make no API call. Use --force-new to bypass the cache (e.g. after switching models).
Thinking models — If your LLM endpoint serves a reasoning/thinking model (qwen3, deepseek-r1, etc.), it must suppress thinking tokens or they exhaust the token budget before writing the answer. llama.cpp supports this via chat_template_kwargs:
# ~/.config/tokenscape/config.toml
[provider]
base_url = "http://your-llamacpp-server/v1"
api_key = "none"
model = "Qwen3.6-35B-A3B-MXFP4_MOE.gguf"tokenscape automatically passes {"enable_thinking": false} via chat_template_kwargs to llama.cpp. Ollama does not reliably honor this flag via its OpenAI-compatible endpoint — use llama.cpp or a non-thinking model with Ollama.
Sharing with teammates — use tokenscape bundle to create a zip of your session data, then teammates run full-report --source against it on their own machine. No server required.
# On your machine:
tokenscape bundle # → tokenscape-bundle-20260531.zip
# On a teammate's machine:
tokenscape full-report --source tokenscape-bundle-20260531.zip --output report.mdThe bundle contains only .jsonl session files — no credentials, settings, or other config. It does contain your full prompt history; only share with people you trust. For individual commands, point at the extracted directory via CLAUDE_CONFIG_DIR=/path/to/extracted tokenscape patterns.
If you use Claude Code across multiple machines, run bundle on each one and merge the zips before analysis.
# On each machine:
tokenscape bundle # → tokenscape-bundle-YYYYMMDD.zipTransfer all zips to one machine, then:
mkdir merged
for zip in machine1.zip machine2.zip machine3.zip; do
unzip -o "$zip" -d merged/
done
tokenscape full-report --source merged/Works because session files are named by UUID — no collisions when merging. Any tokenscape command that accepts --source works against the merged directory.
All reports show four token types plus total:
| Type | Description |
|---|---|
| Input | Tokens not served from cache |
| Output | Generated tokens |
| Cache read | Tokens read from prompt cache |
| Cache write | Tokens written to prompt cache |
Grouped by: day, project, model, activity, tool, shell command, MCP server.
Each turn is classified into one of 13 categories:
| Category | Trigger |
|---|---|
| Coding | Edit, Write, apply_patch, or write_file used |
| Feature Dev | add/create/implement keywords + edits |
| Refactoring | refactor/rename/simplify + edits |
| Debugging | error/fix/bug keywords + tool use |
| Testing | pytest/jest/go test in shell |
| Exploration | Read/Grep/Glob/WebSearch/read_file/search_files only |
| Planning | EnterPlanMode / TaskCreate tools |
| Delegation | Agent / Task tool spawn |
| Git Ops | git push/commit/merge in shell |
| Build/Deploy | docker/npm build/pip install in shell |
| Brainstorming | brainstorm/design keywords, no edits |
| Conversation | No tools, pure text |
| General | Skill tool or uncategorized |
Codex tool names (exec_command, apply_patch, write_file, read_file, search_files) map into the same categories as their Claude equivalents.
Claude Code (--tool claude, default)
~/.claude/projects/<sanitized-cwd>/<session-id>.jsonl- Override base path with
CLAUDE_CONFIG_DIRenv var - macOS:
~/Library/Application Support/Claude/local-agent-mode-sessions/
Turns deduplicated by message.id. Skill body injections, task notifications, system XML blocks, and terminal output pasted as prompts are stripped from user text before analysis.
Codex CLI (--tool codex)
~/.codex/sessions/YYYY/MM/DD/rollout-*.jsonl
Token counts are accumulated from per-API-call last_token_usage deltas within each task turn (bounded by task_started / task_complete events). Codex sessions have no cache_write equivalent — that column always shows 0. Tool names (exec_command, apply_patch, etc.) map into the same activity classification used for Claude.
Date filtering is per-entry timestamp for both tools, so sessions spanning midnight are bucketed correctly.
git clone https://github.com/scheidydude/tokenscape
cd tokenscape
uv venv .venv --python 3.11
uv pip install -e .
pytest
tokenscape statusOptional extras for semantic clustering:
uv pip install -e ".[semantic]"
tokenscape semanticRequires Python 3.11+.