Rocketgraph/rocketgraph

Rocketgraph

Rocketgraph 🚀

Self-hosted log clustering and streaming anomaly detection that drops in next to the observability stack you already run.

What's in here • Quick start • Examples • Website • Community

Apache 2.0 Python Docker OpenTelemetry


Rocketgraph ML: 2M logs clustered into 58 templates in 90 seconds

Why?

Your monitoring tool tells you what you searched for. It rarely tells you what's unusual right now.

Rocketgraph sits next to whatever you already pay for (Datadog, New Relic, Loki, CloudWatch, Sentry, ClickHouse), pulls a window of logs, mines structural templates, and flags the anomalous ones. It runs entirely inside your network. Your logs never leave your VPC. There's no SaaS tier to pay for.

What's in here

| Component | What it does |
| --- | --- |
| 🧠 ML engine | Clusters logs into structural templates and detects anomalies. Pulls directly from your existing log source, with no parallel ingest pipeline. |
| ⚡ @rgraph/otel-node | AI agent that auto-instruments any Node.js service with OpenTelemetry in ~90 seconds. |

Try it in 90 seconds

git clone https://github.com/Rocketgraph/rocketgraph
cd rocketgraph/ml
cp .env.example .env             # fill in whichever sources you have
docker compose up --build        # → http://localhost:9020

Point it at any source you already use:

curl 'http://localhost:9020/clusters?source=loki&window=1h'
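The same query can be issued from Python. This is a minimal sketch using only the standard library: the helper names `clusters_url` and `fetch_clusters` are hypothetical, and decoding the body as JSON is an assumption, since the response format is not described here (see ml/README.md for the actual API).

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def clusters_url(base_url: str, source: str, window: str) -> str:
    """Build the /clusters query URL for a given source and time window."""
    query = urlencode({"source": source, "window": window})
    return f"{base_url.rstrip('/')}/clusters?{query}"


def fetch_clusters(base_url: str, source: str, window: str, timeout: float = 30.0):
    """GET the clusters endpoint and decode the body (assumed to be JSON)."""
    with urlopen(clusters_url(base_url, source, window), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the engine from the quick start running, `fetch_clusters("http://localhost:9020", "loki", "1h")` sends the same request as the curl command above.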

Or skip the credentials entirely: download a log file and run it. Export from Datadog (CSV/JSON), run kubectl logs > app.log, or take any raw log, drop it in, and analyse it locally:

curl -XPOST 'http://localhost:9020/clusters/train?source=file'   # FILE_PATH=/data/app.log

See the one-command log-file quickstart.

That's the whole install. No schemas to provision, no accounts to create, no agents on hosts.

👉 Deep dive: ml/README.md for the ML engine · packages/otel-node for the OTel agent

How it works (30-second version)

Three deterministic algorithms in sequence (no LLM, no hallucination, fully reproducible):

  1. Drain3 mines structural templates from raw log lines.
  2. Isolation Forest scores templates per service to surface the unusual ones.
  3. Half-Space-Trees scores brand-new logs against the trained model in real time.
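To make step 1 concrete, here is a toy illustration of what "mining a structural template" means. It is not Drain3: Drain3 builds a fixed-depth parse tree, whereas this sketch only masks numeric tokens with regular expressions and counts the results. The function names and sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Order matters: mask floats before integers so "12.50" is not split in two.
_MASKS = [
    (re.compile(r"\b\d+\.\d+\b"), "<FLOAT>"),
    (re.compile(r"\b\d+\b"), "<NUM>"),
]


def to_template(line: str) -> str:
    """Replace variable numeric tokens with placeholders."""
    for pattern, placeholder in _MASKS:
        line = pattern.sub(placeholder, line)
    return line


def mine_templates(lines):
    """Count how many raw lines collapse into each template."""
    return Counter(to_template(line) for line in lines)


# Illustrative log lines, not real data.
logs = [
    "Token refreshed for session 1841",
    "Token refreshed for session 77",
    "Charge 5512 authorized for $12.50",
    "Charge 5513 authorized for $8.00",
    "Token refreshed for session 9",
]
templates = mine_templates(logs)
for template, count in templates.most_common():
    print(count, template)
```

Five raw lines collapse into two templates here. The anomaly-scoring steps then work on templates like these rather than on individual lines.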

On a real production burst we test against: 2M logs → 58 templates → 9 anomalies, 90 seconds wall-clock, single container. Full details in ml/README.md.

Rocketgraph: find the anomaly hiding in your logs

Examples

Analyse a log file locally: analyze.py

The fastest way to see Rocketgraph work: drop a log file in ./logs/, run one command, and get a cluster table with the anomalies flagged. No accounts, no API keys, nothing leaves your machine. Add --ai for an optional Claude triage on top. The engine itself stays deliberately LLM-free and reproducible; the model only explains the deterministic clusters.

cd example-setups/logfile-quickstart

docker compose up --build -d            # ML engine on http://localhost:9020
python gen_sample_log.py                # or: cp ~/Downloads/whatever.log ./logs/file.log
pip install requests                    # anthropic too, if you'll use --ai

python analyze.py                       # table of all clusters
python analyze.py --anomalies-only      # just the flagged ones
python analyze.py --ai                  # table + AI triage
python analyze.py mylogs.log --ai       # a specific file

analyze.py auto-detects the file, points the engine at it, pulls the clusters, and prints them. ~15,000 raw lines collapse to ~11 structural templates; the brand-new "database failover" template (8 lines, never seen before, error level) comes back flagged as an anomaly. No rules written, no labels:

15188 logs → 11 clusters (3 anomalous)

  ANOM SERVICE        LOGS DEPTH  TEMPLATE
  ----------------------------------------
   *   payment-svc       8     3  Database failover: replica <*> promoted to primary after ...
   *   auth-svc       1573     2  Token refreshed for session <NUM>
       payment-svc    1686        Charge <NUM> authorized for $<FLOAT>
       ...

Reading the table: ANOM marks the clusters Isolation Forest flagged; LOGS is how many raw lines collapsed into that template; DEPTH is the isolation depth on anomalous clusters (lower = more anomalous); TEMPLATE is the structural pattern Drain3 mined. The flagged failover cluster is rare and new, which is exactly what surfaces it.
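Why a lower depth means "more anomalous" can be shown with a toy one-dimensional isolation tree. This is a sketch of the idea only, not Rocketgraph's implementation: the single feature (log count per template) and all the numbers are assumptions, and a production Isolation Forest such as scikit-learn's also subsamples the data and normalises path lengths into a score.

```python
import random


def isolation_depth(point, sample, rng, depth=0, max_depth=12):
    """Number of random splits needed to isolate `point` within `sample`."""
    if len(sample) <= 1 or depth >= max_depth:
        return depth
    lo, hi = min(sample), max(sample)
    if lo == hi:
        return depth
    split = rng.uniform(lo, hi)
    # Keep only the side of the split that still contains the point.
    if point < split:
        side = [x for x in sample if x < split]
    else:
        side = [x for x in sample if x >= split]
    return isolation_depth(point, side, rng, depth + 1, max_depth)


def mean_depth(point, data, trees=200, seed=0):
    """Average isolation depth over many random trees (seeded for repeatability)."""
    rng = random.Random(seed)
    return sum(isolation_depth(point, data, rng) for _ in range(trees)) / trees


# Made-up log counts per template: eight busy templates and one rare one.
counts = [1500, 1520, 1490, 1610, 1575, 1540, 1630, 1505, 8]
print("rare template:  ", mean_depth(8, counts))
print("common template:", mean_depth(1540, counts))
```

The rare value is usually cut off by the first random split, while a value in the dense group needs several splits, so its average depth is higher.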

With --ai, the same clusters are handed to Claude for an SRE-style triage (likely incident, ranked root-cause hypotheses, and concrete next steps), grounded only in the clusters above. Full walkthrough in the log-file quickstart.

End-to-end reference apps

example-setups/ also contains reference apps you can point otel-node at to see the whole pipeline working: instrument the service, ship OTLP into your sink, then watch Rocketgraph cluster and flag the logs.

| Example | What it shows |
| --- | --- |
| bookstore-app | Express + TypeScript service auto-instrumented by @rgraph/otel-node. The easiest way to see traces, metrics, and logs flowing into Rocketgraph end-to-end. |

More examples (Fastify, NestJS, Next.js) are on the roadmap. PRs welcome.

Compatibility

| Status | Platforms |
| --- | --- |
| ✅ Supported | Log file (.log/.json/.csv) · OpenTelemetry · Loki · New Relic · Datadog · CloudWatch · Sentry · ClickHouse |
| 🛣️ Roadmap | Splunk · Elastic / OpenSearch · Azure Monitor · GCP Cloud Logging |

Community

  • 💬 Discord: support and design discussions
  • 🐛 GitHub Issues: bugs and feature requests
  • 🐦 @RGraphql: release notes

Contributing

PRs welcome. The most impactful contributions right now:

  • New ML connectors (Splunk, OpenSearch, Azure Monitor, GCP Cloud Logging)
  • Additional framework support in @rgraph/otel-node (Fastify, NestJS, Remix, Bun-native services)
  • More end-to-end reference apps under example-setups/

See ml/README.md and packages/otel-node for the deep-dive docs.

License

Apache 2.0. See LICENSE.


Self-hosted. Open source. Drops in next to what you already run.
rocketgraph.app
