Caltrain Bot is a Telegram bot that answers Caltrain schedule questions in plain English. It loads static GTFS data into SQLite and uses DSPy plus an LLM provider to turn questions like "When is the next train from Palo Alto to San Francisco around 8am?" into schedule lookups.
Try the live bot on Telegram: @CalTrain
Read the build story: Part 1, Part 2, Part 3
- Answers Caltrain schedule questions from natural language.
- Uses bundled Caltrain GTFS data, so schedule lookups come from a static transit feed rather than live scraping.
- Supports both local LLMs through Ollama and hosted models through OpenRouter.
- Runs as a Telegram bot today, with the schedule and question-analysis logic separated enough to support other chat frontends later.
Example questions:
- "When is the next train from Mountain View to San Francisco?"
- "Are there any trains from Palo Alto to San Jose after 6pm?"
- "What trains leave Millbrae around 8 in the morning?"
Requirements:

- Python 3.11+
- uv
- A Telegram bot token
- One supported LLM provider: Ollama or OpenRouter
Install dependencies:

```
uv sync
```

The bot reads configuration from your shell and from a .env file if present.
Required variables:
- `TELEGRAM_BOT_TOKEN`
- `LLM_PROVIDER`, set to `ollama` or `openrouter`
If LLM_PROVIDER=ollama, also set:
- `OLLAMA_API_BASE`
- `OLLAMA_MODEL`
If LLM_PROVIDER=openrouter, also set:
- `OPENROUTER_API_KEY`
- `OPENROUTER_MODEL`
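A sketch of how this provider-conditional validation could look — an assumed helper for illustration, not the project's actual `config.py`:

```python
REQUIRED_BY_PROVIDER = {
    "ollama": ["OLLAMA_API_BASE", "OLLAMA_MODEL"],
    "openrouter": ["OPENROUTER_API_KEY", "OPENROUTER_MODEL"],
}


def validate_env(env: dict[str, str]) -> list[str]:
    """Return the names of missing or invalid required variables."""
    missing = [v for v in ("TELEGRAM_BOT_TOKEN", "LLM_PROVIDER") if not env.get(v)]
    provider = env.get("LLM_PROVIDER", "")
    if provider in REQUIRED_BY_PROVIDER:
        # Only the chosen provider's variables are required.
        missing += [v for v in REQUIRED_BY_PROVIDER[provider] if not env.get(v)]
    elif provider:
        missing.append(f"LLM_PROVIDER (unsupported value: {provider})")
    return missing
```

Passing the environment in as a plain dict (rather than reading `os.environ` directly) keeps the check easy to unit-test.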
Example .env for OpenRouter:

```
TELEGRAM_BOT_TOKEN=your-telegram-bot-token
LLM_PROVIDER=openrouter
OPENROUTER_API_KEY=your-openrouter-api-key
OPENROUTER_MODEL=openai/gpt-4.1-mini
```

Example .env for Ollama:
```
TELEGRAM_BOT_TOKEN=your-telegram-bot-token
LLM_PROVIDER=ollama
OLLAMA_API_BASE=http://127.0.0.1:11434
OLLAMA_MODEL=llama3.2
```

Run the bot:

```
uv run caltrain-bot
```

Run the test suite:
```
uv run pytest
```

Run linting:
```
uv run ruff check .
```

Format the codebase:
```
uv run ruff format .
```

Run type checks:
```
uv run ty check src
```

Build and publish the ARM64 image:
```
docker buildx build --platform linux/arm64 \
  -t curiousdima/caltrain-bot:0.1.1 \
  -t curiousdima/caltrain-bot:latest \
  --push .
```

Required environment variables:
- Always: `TELEGRAM_BOT_TOKEN`, `LLM_PROVIDER`
- OpenRouter mode: `OPENROUTER_API_KEY`, `OPENROUTER_MODEL`
- Ollama mode: `OLLAMA_API_BASE`, `OLLAMA_MODEL`
Example /opt/caltrain-bot/caltrain-bot.env for OpenRouter:

```
TELEGRAM_BOT_TOKEN=your-telegram-bot-token
LLM_PROVIDER=openrouter
OPENROUTER_API_KEY=your-openrouter-api-key
OPENROUTER_MODEL=openai/gpt-4.1-mini
```

Deploy on a Raspberry Pi:
```
docker run -d \
  --name caltrain-bot \
  --restart unless-stopped \
  --env-file /opt/caltrain-bot/caltrain-bot.env \
  curiousdima/caltrain-bot:latest
```

Example /opt/caltrain-bot/caltrain-bot.env for Ollama on the same Raspberry Pi host:
```
TELEGRAM_BOT_TOKEN=your-telegram-bot-token
LLM_PROVIDER=ollama
OLLAMA_API_BASE=http://127.0.0.1:11434
OLLAMA_MODEL=llama3.2
```

Deploy with host networking when Ollama runs directly on the Raspberry Pi host:
```
docker run -d \
  --name caltrain-bot \
  --restart unless-stopped \
  --network host \
  --env-file /opt/caltrain-bot/caltrain-bot.env \
  curiousdima/caltrain-bot:latest
```

Project layout:

- `src/caltrain_bot/__init__.py`: CLI entrypoint that builds and runs the Telegram app.
- `src/caltrain_bot/telegram_bot.py`: Telegram bot handlers and user-facing message formatting.
- `src/caltrain_bot/question_analysis.py`: DSPy-based question classification and entity extraction.
- `src/caltrain_bot/schedule.py`: GTFS loading, SQL preprocessing, station lookup, and train queries.
- `src/caltrain_bot/config.py`: Environment validation and repo-relative asset paths.
- `sql/train_stop_timeline.sql`: SQL used to preprocess imported GTFS data into query-friendly tables.
- `data/caltrain-ca-us.zip`: Bundled Caltrain GTFS feed used by the app.
- `tests/unit/`: Unit tests.
Contributions are welcome. The current project focuses on Telegram, but the schedule and question-analysis logic are already separated enough to support additional chat frontends.
WhatsApp and WeChat support would be especially welcome contributions. If you want to add another messaging integration, keep transport-specific code at the edge and reuse the existing schedule-lookup and question-parsing modules.
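One way to keep transport-specific code at the edge is to expose the shared logic as a single text-in/text-out function that each frontend wraps. A sketch under that assumption — the names here are illustrative, not the project's actual interfaces:

```python
from typing import Callable


def answer_question(text: str) -> str:
    """Shared core: question text in, answer text out.

    In this project this would compose the question-analysis and schedule
    modules; here it is stubbed for illustration.
    """
    return f"(answer for: {text})"


def make_telegram_handler(core: Callable[[str], str]) -> Callable[[str], str]:
    """Transport adapter: only Telegram-specific concerns live here."""
    def handle(update_text: str) -> str:
        return core(update_text.strip())
    return handle


# A WhatsApp or WeChat integration would add its own thin adapter around
# the same core, leaving the schedule and parsing modules untouched.
```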