Self-hosted log clustering and streaming anomaly detection that drops in next to the observability stack you already run.
What's in here • Quick start • Examples • Website • Community
Your monitoring tool tells you what you searched for. It rarely tells you what's unusual right now.
Rocketgraph sits next to whatever you already pay for (Datadog, New Relic, Loki, CloudWatch, Sentry, ClickHouse), pulls a window of logs, mines structural templates, and flags the anomalous ones. It runs entirely inside your network. Your logs never leave your VPC. There's no SaaS tier to pay for.
| Component | What it does |
|---|---|
| 🧠 ML engine | Clusters logs into structural templates and detects anomalies. Pulls directly from your existing log source, with no parallel ingest pipeline. |
| ⚡ @rgraph/otel-node | AI agent that auto-instruments any Node.js service with OpenTelemetry in ~90 seconds. |
git clone https://github.com/Rocketgraph/rocketgraph
cd rocketgraph/ml
cp .env.example .env # fill in whichever sources you have
docker compose up --build # → http://localhost:9020
Point it at any source you already use:
curl 'http://localhost:9020/clusters?source=loki&window=1h'
Or skip the credentials entirely: download a log file and run it. Export from Datadog (CSV/JSON), `kubectl logs > app.log`, or any raw log, drop it in, and analyse it locally:
curl -XPOST 'http://localhost:9020/clusters/train?source=file' # FILE_PATH=/data/app.log
See the one-command log-file quickstart.
That's the whole install. No schemas to provision, no accounts to create, no agents on hosts.
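For scripting against the engine, the `/clusters` call from the quick start can be wrapped in a few lines of Python. This is a minimal sketch using only the standard library. The helper names `clusters_url` and `fetch_clusters` are hypothetical, and the sketch assumes the endpoint returns a JSON body, which this README does not spell out.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:9020"  # engine address from the quick start


def clusters_url(source: str, window: str, base: str = BASE_URL) -> str:
    """Build the /clusters query URL for a given source and time window."""
    query = urllib.parse.urlencode({"source": source, "window": window})
    return f"{base}/clusters?{query}"


def fetch_clusters(source: str, window: str = "1h"):
    """GET /clusters and decode the body (assumed, not confirmed, to be JSON)."""
    with urllib.request.urlopen(clusters_url(source, window), timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_clusters("loki") once the engine is up.
    print(clusters_url("loki", "1h"))
```

Keeping URL construction separate from the network call makes the query easy to check without a running container.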
Deep dive: ml/README.md for the ML engine · packages/otel-node for the OTel agent
Three deterministic algorithms in sequence (no LLM, no hallucination, fully reproducible):
- Drain3 mines structural templates from raw log lines.
- Isolation Forest scores templates per service to surface the unusual ones.
- Half-Space-Trees scores brand-new logs against the trained model in real time.
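To make the first step concrete, here is a toy illustration of what "mining structural templates" means. It is not Drain3 and not Rocketgraph's code: it masks numeric tokens with regular expressions and groups identical results, whereas Drain3 builds a fixed-depth parse tree and matches lines by token similarity. The `<NUM>` and `<FLOAT>` placeholders mirror the ones in Rocketgraph's output; the function names and sample lines are made up for illustration.

```python
import re
from collections import Counter

FLOAT_RE = re.compile(r"\b\d+\.\d+\b")  # e.g. 12.50
NUM_RE = re.compile(r"\b\d+\b")         # e.g. 881


def to_template(line: str) -> str:
    """Replace variable numeric tokens with placeholders (floats first)."""
    line = FLOAT_RE.sub("<FLOAT>", line)
    return NUM_RE.sub("<NUM>", line)


def mine_templates(lines):
    """Count how many raw lines collapse into each template."""
    return Counter(to_template(line) for line in lines)


if __name__ == "__main__":
    # Hypothetical log lines, not taken from a real service.
    sample = [
        "Token refreshed for session 1041",
        "Token refreshed for session 2207",
        "Charge 881 authorized for $12.50",
        "Charge 882 authorized for $7.25",
        "Database failover: replica 3 promoted to primary",
    ]
    for template, count in mine_templates(sample).most_common():
        print(count, template)
```

A template that appears only a handful of times, like the failover line here, is the kind of cluster the later scoring steps are meant to surface.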
On a real production burst we test against: 2M logs → 58 templates → 9 anomalies, 90 seconds wall-clock, single container. Full details in ml/README.md.
The fastest way to see Rocketgraph work: drop a log file in ./logs/, run one
command, and get a cluster table with the anomalies flagged. No accounts, no API
keys, nothing leaves your machine. Add --ai for an optional Claude triage on
top; the engine itself stays deliberately LLM-free and reproducible, and the
model only explains the deterministic clusters.
cd example-setups/logfile-quickstart
docker compose up --build -d # ML engine on http://localhost:9020
python gen_sample_log.py # or: cp ~/Downloads/whatever.log ./logs/file.log
pip install requests # anthropic too, if you'll use --ai
python analyze.py # table of all clusters
python analyze.py --anomalies-only # just the flagged ones
python analyze.py --ai # table + AI triage
python analyze.py mylogs.log --ai # a specific file
analyze.py auto-detects the file, points the engine at it, pulls the clusters,
and prints them. ~15,000 raw lines collapse to ~11 structural templates; the
brand-new "database failover" template (8 lines, never seen before, error
level) comes back flagged as an anomaly. No rules written, no labels:
15188 logs → 11 clusters (3 anomalous)
ANOM  SERVICE      LOGS  DEPTH  TEMPLATE
----------------------------------------
*     payment-svc     8      3  Database failover: replica <*> promoted to primary after ...
*     auth-svc     1573      2  Token refreshed for session <NUM>
      payment-svc  1686         Charge <NUM> authorized for $<FLOAT>
...
Reading the table: ANOM marks the clusters Isolation Forest flagged; LOGS
is how many raw lines collapsed into that template; DEPTH is the isolation
depth on anomalous clusters (lower = more anomalous); TEMPLATE is the
structural pattern Drain3 mined. The flagged failover cluster is rare and new,
which is exactly what surfaces it.
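If you want to post-process the table in a script, the rows can be parsed using the column semantics just described. This is a minimal sketch, assuming the layout of the sample output (a leading `*` on flagged rows, and a DEPTH value only on flagged rows). The `ClusterRow` type and `parse_row` helper are hypothetical and not part of analyze.py.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Optional "*" flag, then service name, then the LOGS count, then the rest.
ROW_RE = re.compile(
    r"^\s*(?P<anom>\*)?\s*(?P<service>\S+)\s+(?P<logs>\d+)\s+(?P<rest>.*)$"
)


@dataclass
class ClusterRow:
    anomalous: bool
    service: str
    logs: int
    depth: Optional[int]  # only present on flagged rows
    template: str


def parse_row(line: str) -> ClusterRow:
    """Parse one data row of the cluster table into a ClusterRow."""
    m = ROW_RE.match(line)
    if m is None:
        raise ValueError(f"not a cluster row: {line!r}")
    anomalous = m.group("anom") is not None
    rest = m.group("rest").strip()
    depth = None
    if anomalous:
        # Assumption: flagged rows carry DEPTH as the first token after LOGS.
        parts = rest.split(None, 1)
        depth = int(parts[0])
        rest = parts[1] if len(parts) > 1 else ""
    return ClusterRow(anomalous, m.group("service"), int(m.group("logs")), depth, rest)


if __name__ == "__main__":
    print(parse_row("* auth-svc 1573 2 Token refreshed for session <NUM>"))
    print(parse_row("payment-svc 1686 Charge <NUM> authorized for $<FLOAT>"))
```

Sorting the flagged rows by `depth` ascending puts the most anomalous clusters first, since a lower isolation depth means more anomalous.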
With --ai, the same clusters are handed to Claude for an SRE-style triage
(likely incident, ranked root-cause hypotheses, and concrete next steps), grounded
only in the clusters above. Full walkthrough in the
log-file quickstart.
example-setups/ also contains reference apps you can point
otel-node at to see the whole pipeline working: instrument the service, ship
OTLP into your sink, then watch Rocketgraph cluster and flag the logs.
| Example | What it shows |
|---|---|
| bookstore-app | Express + TypeScript service auto-instrumented by @rgraph/otel-node, the easiest way to see traces, metrics, and logs flowing into Rocketgraph end-to-end. |
More examples (Fastify, NestJS, Next.js) are on the roadmap. PRs welcome.
| Status | Platforms |
|---|---|
| ✅ Supported | Log file (.log/.json/.csv) · OpenTelemetry · Loki · New Relic · Datadog · CloudWatch · Sentry · ClickHouse |
| 🛣️ Roadmap | Splunk · Elastic / OpenSearch · Azure Monitor · GCP Cloud Logging |
- 💬 Discord: support and design discussions
- GitHub Issues: bugs and feature requests
- 🐦 @RGraphql: release notes
PRs welcome. The most impactful contributions right now:
- New ML connectors (Splunk, OpenSearch, Azure Monitor, GCP Cloud Logging)
- Additional framework support in @rgraph/otel-node (Fastify, NestJS, Remix, Bun-native services)
- More end-to-end reference apps under example-setups/
See ml/README.md and packages/otel-node for the deep-dive docs.
Apache 2.0. See LICENSE.
Self-hosted. Open source. Drops in next to what you already run.
rocketgraph.app

