OpenGIKAI — Opening Up Parliament

open-gikai.net | 🇯🇵 日本語版はこちら / Japanese

OpenGIKAI (議会) is an open-source public media project that transforms Japanese parliamentary proceedings into a modern, accessible thread format — like social media, but with official sources. It ingests multiple sources including Diet records (NDL), Prime Minister press conferences (kantei.go.jp), and government council meeting minutes (審議会).

What It Does

Fetches official transcripts from multiple sources: NDL Diet Records API, kantei.go.jp press conferences, and government council meeting minutes
Uses AI (Claude) to summarize and structure speeches by topic
Links each thread to related news articles with image previews (Bing News)
Presents them in a thread-based UI with three reading levels:
- 🌱 Easy — Simple language for everyone
- 📖 Standard — Balanced detail with brief explanations
- 📰 Detailed — Full political context, news-style

Why

Parliamentary records are public but hard to read. OpenGIKAI makes them accessible without editorializing — every summary links back to the original transcript. The AI prompts and processing logic are fully open-source to ensure transparency and political neutrality.

Tech Stack

Layer	Technology
Frontend	Next.js 16 (App Router), TypeScript, Tailwind CSS
Deployment	Vercel — two projects from one repo: SSG frontend at the root, dynamic MCP server at `apps/mcp/`
Data Pipeline	Python + Claude API (Message Batches API + prompt caching)
Data Sources	NDL Diet Records API, kantei.go.jp, cao.go.jp councils
Public API	Read-only MCP server for Claude Desktop / Cline / custom agents

Getting Started

# Clone the repository
git clone https://github.com/wharfe/open-gikai.git
cd open-gikai

# Install frontend dependencies
npm install

# Start the frontend dev server
npm run dev

The MCP server is a separate Next.js project under apps/mcp/ with its own dependencies:

cd apps/mcp
npm install
npm run dev   # serves on http://localhost:3100

See apps/mcp/README.md for MCP deployment details.

Project Structure

├── src/                  # Frontend (Next.js SSG — output: "export")
│   ├── app/              # App Router pages
│   ├── components/       # React components
│   ├── lib/              # Utilities and data fetching
│   └── types/            # TypeScript type definitions
├── apps/
│   └── mcp/              # MCP server (separate Vercel project, dynamic Node runtime)
├── scripts/              # Python batch processing
│   ├── sources/          # Source adapters (NDL, kantei, council, …)
│   └── pipeline/         # AI pipeline (grouping, summarization, news ranker)
├── data/                 # Generated JSON consumed by both frontend SSG and MCP server
│   ├── threads/          # Per-date thread files
│   └── members.json      # Accumulated Diet member registry
├── public/               # Static assets (incl. sitemap, RSS feed)
└── .github/workflows/    # daily-batch.yml (6:00 AM JST cron)

The frontend uses output: "export", so anything requiring a Node runtime (Route Handlers, dynamic APIs) lives under apps/.

How It Works

Sources (NDL, kantei, council)
   ├─► fetch (sliding 30-day window per run)
   │
   ├─► group by topic                  ┐
   ├─► classify tension                │  Claude API
   ├─► summarize at 3 levels (Batches) │  + prompt caching
   ├─► extract commitments / outcomes  ┘
   │
   ├─► enrich with related news (Bing News + Claude relevance ranker)
   │
   └─► generate JSON
         ├─► frontend SSG  → open-gikai.net (Vercel)
         └─► MCP server     → /api/mcp       (Vercel, apps/mcp)

Sliding-window fetch: Each run re-fetches the last 30 days from every source. NDL publishes transcripts with a multi-day to multi-week lag, so a yesterday-only fetch silently misses retroactively-published meetings.
AI processing:
- Grouping — sync call per meeting; clusters speeches into thematic threads.
- Summarization — Message Batches API per thread (50% cost discount, stackable with prompt caching for ~90% input savings on the cached prefix).
- Outcome extraction — sync call per meeting; reads procedural speeches for vote results / attached resolutions.
News enrichment: Searches Bing News by topic, then a Claude Haiku ranker (scripts/pipeline/news_ranker.py) picks the most-relevant 3 articles from the candidate pool. Auxiliary information layer — see CLAUDE.md "Summary Layer Invariants" for the boundary.
Static generation: data/threads/*.json and data/members.json are consumed by the Next.js SSG to produce static HTML pages.
Deployment: Two Vercel projects pointing at the same repo — root (SSG frontend) and apps/mcp (dynamic MCP server).
Monitoring: Daily batch commits include (+N threads) in the message; the workflow emits a CI warning when 7+ consecutive runs add 0 threads (catches fetcher regressions that the green checkmark alone wouldn't). A hard job failure (e.g. NDL API 403 from a datacenter IP, Anthropic credit exhaustion) opens or updates a pipeline-failure GitHub Issue so it surfaces without polling gh run list.
- Batch resume: A summary batch that exceeds the per-run poll budget is no longer cancelled. Its id + a grouping manifest (with per-thread input_hash) are persisted to a committed sidecar at data/pending-batches/{date}.json and resumed on the next run, which re-fetches raw, verifies the hash, and assembles without re-grouping. A batch stuck in-flight for >2 days, or one that fails 3 runs in a row, opens/updates the same pipeline-failure Issue.

Data Pipeline

# 1. Fetch speeches across a sliding window (30 days catches retroactive NDL uploads)
python scripts/fetch_ndl.py     --lookback-days 30
python scripts/fetch_kantei.py  --lookback-days 30
python scripts/fetch_council.py --lookback-days 30 --council kisei
# (... repeat per council, see daily-batch.yml for the full list)

# 2. Summarize via Message Batches API (auto-resumes against existing data/threads/)
#    Requires ANTHROPIC_API_KEY in .env.
python scripts/summarize.py --date 2026-04-22 --batch

# 3. Enrich with news, filtered through Claude relevance ranker
python scripts/enrich-news.py --date 2026-04-22 --rank-with-claude

# 4. Generate sitemaps, feeds, and validation; build the SSG
node scripts/validate-data.mjs --fix
node scripts/generate-feeds.js
node scripts/generate-sitemap.mjs
npm run build && npx serve out

See .env.example for configuration. The daily batch workflow at .github/workflows/daily-batch.yml runs all of these in sequence at 6:00 JST.

MCP Server

OpenGIKAI exposes its dataset as a read-only Model Context Protocol server so Claude Desktop, Cline, or any MCP-capable agent can query Diet discussions directly.

Tool	Purpose
`search_threads`	Find threads by keyword, date range, committee, or source
`get_thread`	Fetch a full thread with 3-level summaries, original quotes, tension classifications, and outcomes
`get_member`	Diet member profile
`list_members`	Paginate members, filter by name/party
`list_dates`	Index of dates with available threads

The server lives in apps/mcp/ and is deployed as a second Vercel project from the same repository. OpenGIKAI does NOT pay for LLM inference — the MCP client (Claude Desktop, etc.) calls Claude with its own API key, and the server only returns JSON. See apps/mcp/README.md for the endpoint URL and Claude Desktop configuration snippet.

Design Principles

Political neutrality by design — All speeches processed with identical algorithms. No editorial selection. Prompts are open-source. See "Summary Layer Invariants" in CLAUDE.md for the non-negotiable rules (stateless, deterministic, prompt-only) and the boundary between the summary layer and auxiliary layers (news enrichment, MCP server) where LLM/agent patterns are allowed.
Source transparency — Every summary links to the original NDL/kantei/council transcript.
AI transparency — All AI-generated content is clearly labeled. MCP responses include an attribution block making this explicit for downstream agents.
Accessibility — Three reading levels make Diet proceedings approachable for everyone.

Data Source

Diet records are sourced from the National Diet Library's Diet Records Search System. These records are not subject to copyright under Japan's Copyright Act, Article 13. Press conference transcripts are sourced from kantei.go.jp. Government council meeting minutes (審議会) are sourced from cao.go.jp and other ministry websites.

AI-generated summaries are clearly attributed as such. Related news articles are linked from Bing News RSS; only URLs, source names, publication dates, and OGP preview images are displayed — no article content is reproduced.

Contributing

See CONTRIBUTING.md for guidelines.

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 247 Commits
.github		.github
apps/mcp		apps/mcp
assets/fonts		assets/fonts
data		data
docs/superpowers		docs/superpowers
e2e		e2e
public		public
scripts		scripts
src		src
tests/unit		tests/unit
.env.example		.env.example
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.ja.md		CONTRIBUTING.ja.md
CONTRIBUTING.md		CONTRIBUTING.md
HANDOFF.md		HANDOFF.md
LICENSE		LICENSE
README.ja.md		README.ja.md
README.md		README.md
eslint.config.mjs		eslint.config.mjs
next.config.ts		next.config.ts
package-lock.json		package-lock.json
package.json		package.json
playwright.config.ts		playwright.config.ts
postcss.config.mjs		postcss.config.mjs
tsconfig.json		tsconfig.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

OpenGIKAI — Opening Up Parliament

What It Does

Why

Tech Stack

Getting Started

Project Structure

How It Works

Data Pipeline

MCP Server

Design Principles

Data Source

Contributing

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

OpenGIKAI — Opening Up Parliament

What It Does

Why

Tech Stack

Getting Started

Project Structure

How It Works

Data Pipeline

MCP Server

Design Principles

Data Source

Contributing

License

About

Topics

Resources

License

Code of conduct

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages