15 changes: 14 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -451,6 +451,19 @@ See issues or open a PR!

---

## 🛠️ Services & Operations

NLBT now includes containerized ingest/enrich/detect/report services with observability,
storage migrations, and runbooks:

- Dockerfiles: `services/ingest`, `services/enrich`, `services/detect`, `services/report`
- Configuration: `docs/configuration.md`
- Runbooks: `docs/runbooks.md`
- Migrations: `storage/migrations`
- Backup/restore: `scripts/db_backup.sh`, `scripts/db_restore.sh`

---

## 📄 License

GPL-3.0 License. See `LICENSE`.
@@ -500,4 +513,4 @@ This project implements several **Agentic Design Patterns**:

See `cursor_chats/Agentic_Design_Patterns_Complete.md` for detailed documentation.

</details>
</details>
41 changes: 41 additions & 0 deletions docs/configuration.md
@@ -0,0 +1,41 @@
# Configuration

## Service endpoints

| Service | Endpoint | Description |
| --- | --- | --- |
| Ingest | `POST /ingest` | Pull recent changes from Wikimedia and normalize payloads. |
| Enrich | `POST /enrich` | Add metadata to ingest payloads. |
| Detect | `POST /detect` | Evaluate enriched data for signals. |
| Report | `POST /report` | Generate report records. |
| Metrics | `GET /metrics` | Prometheus metrics for each service. |
| Health | `GET /healthz` | Liveness probe. |
| Ready | `GET /readyz` | Readiness probe. |

## Environment variables

### Shared

- `LOG_LEVEL` (default: `INFO`)
- `DATABASE_URL` (required for migrations/backups)
- `RETENTION_DAYS` (default: `30`)

### Wikimedia API guards (ingest service)

- `WIKIMEDIA_BASE_URL` (default: `https://en.wikipedia.org`)
- `WIKIMEDIA_RATE_LIMIT` (default: `60`) — max requests per window.
- `WIKIMEDIA_RATE_WINDOW` (default: `60`) — seconds per window.
- `WIKIMEDIA_FAILURE_THRESHOLD` (default: `5`) — failures before circuit opens.
- `WIKIMEDIA_RECOVERY_SECONDS` (default: `30`) — cooldown before retry.

## Retention windows

- Ingest/enrich/detect/report records should be retained for `RETENTION_DAYS`.
- Recommended defaults:
- `30` days for ingest/enrich/detect.
- `90` days for reports.
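A retention sweep can derive its deletion cutoff directly from `RETENTION_DAYS`; a minimal stdlib sketch (the `ingest_records` table name in the comment is hypothetical, not from this repo):

```python
from __future__ import annotations

import os
from datetime import datetime, timedelta, timezone


def retention_cutoff(env=None) -> datetime:
    """Return the timestamp before which records are eligible for deletion.

    Reads RETENTION_DAYS (default 30). A sweep job would then issue e.g.
    DELETE FROM ingest_records WHERE created_at < cutoff
    (table name hypothetical).
    """
    env = os.environ if env is None else env
    days = int(env.get("RETENTION_DAYS", "30"))
    return datetime.now(timezone.utc) - timedelta(days=days)
```

Passing a dict instead of reading `os.environ` keeps the function testable and lets the report pipeline override with its recommended `90`.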

## Thresholds

- Circuit breaker opens after `WIKIMEDIA_FAILURE_THRESHOLD` failures.
- Rate limiter defaults to `60` requests per minute.
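To make the variable semantics above concrete, the two guards can be sketched in plain Python. This is a sketch of the behavior implied by the environment variables; the real implementations live in `nlbt.services.wikimedia` and may differ:

```python
import time
from collections import deque


class RateLimiter:
    """Sliding-window limiter: at most max_requests per per_seconds
    (WIKIMEDIA_RATE_LIMIT requests per WIKIMEDIA_RATE_WINDOW seconds)."""

    def __init__(self, max_requests: int = 60, per_seconds: float = 60.0):
        self.max_requests = max_requests
        self.per_seconds = per_seconds
        self._stamps = deque()

    def allow(self) -> bool:
        now = time.monotonic()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.per_seconds:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_requests:
            return False
        self._stamps.append(now)
        return True


class CircuitBreaker:
    """Opens after failure_threshold consecutive failures; allows a trial
    request again once recovery_seconds have elapsed."""

    def __init__(self, failure_threshold: int = 5, recovery_seconds: float = 30.0):
        self.failure_threshold = failure_threshold
        self.recovery_seconds = recovery_seconds
        self._failures = 0
        self._opened_at = None

    def allow(self) -> bool:
        if self._opened_at is None:
            return True
        return time.monotonic() - self._opened_at >= self.recovery_seconds

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = time.monotonic()
```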
41 changes: 41 additions & 0 deletions docs/runbooks.md
@@ -0,0 +1,41 @@
# Runbooks

## Service restart

**When to use**: deployment rollouts, config changes, or a stuck worker.

1. Confirm the liveness and readiness probes report healthy:
   - `GET /healthz` and `GET /readyz`.
2. Drain traffic (remove instance from load balancer).
3. Restart the container:
- `docker restart <container>`
4. Validate metrics are flowing:
- `GET /metrics` returns `nlbt_requests_total`.
5. Re-add instance to the load balancer.
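Steps 1 and 4 can be automated with a small poll loop. In this sketch, `probe` is any injected callable returning an HTTP status code (e.g. a wrapper around `GET /readyz`) — the helper is hypothetical, not part of the repo:

```python
import time


def wait_until_ready(probe, attempts: int = 5, delay: float = 0.5) -> bool:
    """Poll a readiness probe until it returns 200, up to `attempts` tries.

    `probe` is a zero-argument callable returning an HTTP status code;
    injecting it keeps this loop independent of any HTTP client.
    """
    for _ in range(attempts):
        if probe() == 200:
            return True
        time.sleep(delay)
    return False
```

Run it against `/readyz` before re-adding the instance to the load balancer.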

## Data backfill

**When to use**: missed ingest window, replay upstream data, or reprocessing.

1. Ensure storage is available and migrations are current:
- `scripts/db_migrate.sh`
2. Temporarily raise retention window if needed (see configuration doc).
3. Run the backfill job by calling ingest with a backfill flag (or replay from storage):
- `POST /ingest` with `{"mode": "backfill", "range": "<start>/<end>"}`
4. Monitor enrich/detect/report pipelines:
- Check logs for `enrich_received`, `detect_received`, `report_received`.
5. Once complete, reset retention overrides and verify downstream counts.
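Assuming the `range` field in step 3 carries two ISO 8601 timestamps separated by `/` (the exact wire format is not pinned down by this diff), the ingest service might validate it like this:

```python
from __future__ import annotations

from datetime import datetime


def parse_backfill_range(value: str) -> tuple[datetime, datetime]:
    """Parse the '<start>/<end>' range from a backfill request body.

    Assumes ISO 8601 timestamps joined by '/'; adjust to whatever the
    ingest service actually accepts.
    """
    start_raw, end_raw = value.split("/", 1)
    start = datetime.fromisoformat(start_raw)
    end = datetime.fromisoformat(end_raw)
    if end <= start:
        raise ValueError("backfill range end must be after start")
    return start, end
```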

## Backup/restore

**Backup**

```bash
DATABASE_URL=postgres://... BACKUP_PATH=backup.dump scripts/db_backup.sh
```

**Restore**

```bash
DATABASE_URL=postgres://... BACKUP_PATH=backup.dump scripts/db_restore.sh
```
12 changes: 12 additions & 0 deletions env.example
@@ -11,3 +11,15 @@ MAX_AGENT_ITERATIONS=20

# Optional: Override default model per session
# LLM_MODEL_OVERRIDE=gpt-4o

# Service Configuration
LOG_LEVEL=INFO
DATABASE_URL=postgres://user:pass@localhost:5432/nlbt
RETENTION_DAYS=30

# Wikimedia API guards (ingest service)
WIKIMEDIA_BASE_URL=https://en.wikipedia.org
WIKIMEDIA_RATE_LIMIT=60
WIKIMEDIA_RATE_WINDOW=60
WIKIMEDIA_FAILURE_THRESHOLD=5
WIKIMEDIA_RECOVERY_SECONDS=30
5 changes: 4 additions & 1 deletion pyproject.toml
@@ -19,8 +19,11 @@ dependencies = [
"matplotlib",
"markdown",
"weasyprint",
"fastapi",
"httpx",
"prometheus_client",
"uvicorn",
]

[project.scripts]
nlbt = "nlbt.cli:main"

9 changes: 9 additions & 0 deletions scripts/db_backup.sh
@@ -0,0 +1,9 @@
#!/usr/bin/env bash
set -euo pipefail

DATABASE_URL=${DATABASE_URL:?"DATABASE_URL is required"}
BACKUP_PATH=${BACKUP_PATH:-backup_$(date +%Y%m%d_%H%M%S).dump}

pg_dump --format=custom --file "$BACKUP_PATH" "$DATABASE_URL"

echo "Backup written to $BACKUP_PATH"
6 changes: 6 additions & 0 deletions scripts/db_migrate.sh
@@ -0,0 +1,6 @@
#!/usr/bin/env bash
set -euo pipefail

DATABASE_URL=${DATABASE_URL:?"DATABASE_URL is required"}

psql "$DATABASE_URL" -f storage/migrations/001_init.sql
9 changes: 9 additions & 0 deletions scripts/db_restore.sh
@@ -0,0 +1,9 @@
#!/usr/bin/env bash
set -euo pipefail

DATABASE_URL=${DATABASE_URL:?"DATABASE_URL is required"}
BACKUP_PATH=${BACKUP_PATH:?"BACKUP_PATH is required"}

pg_restore --clean --if-exists --dbname "$DATABASE_URL" "$BACKUP_PATH"

echo "Restored from $BACKUP_PATH"
15 changes: 15 additions & 0 deletions services/detect/Dockerfile
@@ -0,0 +1,15 @@
FROM python:3.11-slim

WORKDIR /app

COPY pyproject.toml README.md /app/
COPY src /app/src

RUN pip install --no-cache-dir --upgrade pip \
&& pip install --no-cache-dir -e .

ENV PYTHONUNBUFFERED=1

EXPOSE 8000

CMD ["uvicorn", "nlbt.services.detect:app", "--host", "0.0.0.0", "--port", "8000"]
15 changes: 15 additions & 0 deletions services/enrich/Dockerfile
@@ -0,0 +1,15 @@
FROM python:3.11-slim

WORKDIR /app

COPY pyproject.toml README.md /app/
COPY src /app/src

RUN pip install --no-cache-dir --upgrade pip \
&& pip install --no-cache-dir -e .

ENV PYTHONUNBUFFERED=1

EXPOSE 8000

CMD ["uvicorn", "nlbt.services.enrich:app", "--host", "0.0.0.0", "--port", "8000"]
15 changes: 15 additions & 0 deletions services/ingest/Dockerfile
@@ -0,0 +1,15 @@
FROM python:3.11-slim

WORKDIR /app

COPY pyproject.toml README.md /app/
COPY src /app/src

RUN pip install --no-cache-dir --upgrade pip \
&& pip install --no-cache-dir -e .

ENV PYTHONUNBUFFERED=1

EXPOSE 8000

CMD ["uvicorn", "nlbt.services.ingest:app", "--host", "0.0.0.0", "--port", "8000"]
15 changes: 15 additions & 0 deletions services/report/Dockerfile
@@ -0,0 +1,15 @@
FROM python:3.11-slim

WORKDIR /app

COPY pyproject.toml README.md /app/
COPY src /app/src

RUN pip install --no-cache-dir --upgrade pip \
&& pip install --no-cache-dir -e .

ENV PYTHONUNBUFFERED=1

EXPOSE 8000

CMD ["uvicorn", "nlbt.services.report:app", "--host", "0.0.0.0", "--port", "8000"]
1 change: 1 addition & 0 deletions src/nlbt/services/__init__.py
@@ -0,0 +1 @@
"""Service entrypoints for NLBT microservices."""
23 changes: 23 additions & 0 deletions src/nlbt/services/detect.py
@@ -0,0 +1,23 @@
from __future__ import annotations

import logging

from fastapi import FastAPI

from nlbt.services.observability import add_observability, configure_logging

SERVICE_NAME = "detect"
configure_logging(SERVICE_NAME)
logger = logging.getLogger("nlbt.services.detect")

app = FastAPI(title="NLBT Detect Service")


@app.post("/detect")
def detect(payload: dict) -> dict:
logger.info("detect_received", extra={"context": {"keys": list(payload.keys())}})
findings = [{"rule": "placeholder", "severity": "low"}]
return {"status": "ok", "findings": findings}


add_observability(app, SERVICE_NAME)
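`configure_logging` is imported from `nlbt.services.observability`, which is not part of this diff. A plausible sketch of the structured formatter such a helper might install, matching the `extra={"context": ...}` convention used in the handler above (the class and its field names are assumptions, not the repo's actual code):

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line, carrying the
    service name and any 'context' dict passed via logger extras."""

    def __init__(self, service: str):
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "service": self.service,
            "level": record.levelname,
            "event": record.getMessage(),
        }
        context = getattr(record, "context", None)
        if context:
            payload["context"] = context
        return json.dumps(payload)
```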
22 changes: 22 additions & 0 deletions src/nlbt/services/enrich.py
@@ -0,0 +1,22 @@
from __future__ import annotations

import logging

from fastapi import FastAPI

from nlbt.services.observability import add_observability, configure_logging

SERVICE_NAME = "enrich"
configure_logging(SERVICE_NAME)
logger = logging.getLogger("nlbt.services.enrich")

app = FastAPI(title="NLBT Enrich Service")


@app.post("/enrich")
def enrich(payload: dict) -> dict:
logger.info("enrich_received", extra={"context": {"keys": list(payload.keys())}})
return {"status": "ok", "enriched": payload}


add_observability(app, SERVICE_NAME)
44 changes: 44 additions & 0 deletions src/nlbt/services/ingest.py
@@ -0,0 +1,44 @@
from __future__ import annotations

import logging
import os

from fastapi import FastAPI, HTTPException

from nlbt.services.observability import add_observability, configure_logging
from nlbt.services.wikimedia import CircuitBreaker, RateLimiter, WikimediaClient

SERVICE_NAME = "ingest"
configure_logging(SERVICE_NAME)
logger = logging.getLogger("nlbt.services.ingest")

app = FastAPI(title="NLBT Ingest Service")

_rate_limiter = RateLimiter(
max_requests=int(os.getenv("WIKIMEDIA_RATE_LIMIT", "60")),
per_seconds=float(os.getenv("WIKIMEDIA_RATE_WINDOW", "60")),
)
_circuit_breaker = CircuitBreaker(
failure_threshold=int(os.getenv("WIKIMEDIA_FAILURE_THRESHOLD", "5")),
recovery_seconds=float(os.getenv("WIKIMEDIA_RECOVERY_SECONDS", "30")),
)
_wikimedia_client = WikimediaClient(
base_url=os.getenv("WIKIMEDIA_BASE_URL", "https://en.wikipedia.org"),
rate_limiter=_rate_limiter,
circuit_breaker=_circuit_breaker,
)


@app.post("/ingest")
def ingest() -> dict:
try:
data = _wikimedia_client.fetch_recent_changes()
except RuntimeError as exc:
raise HTTPException(status_code=503, detail=str(exc)) from exc
except Exception as exc: # noqa: BLE001
logger.exception("ingest_failed")
raise HTTPException(status_code=502, detail="Failed to fetch Wikimedia data") from exc
return {"status": "ok", "items": data.get("query", {}).get("recentchanges", [])}


add_observability(app, SERVICE_NAME)
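`WikimediaClient.fetch_recent_changes` lives in `nlbt.services.wikimedia`, which is not shown in this diff. Assuming it wraps the standard MediaWiki recent-changes query, the request URL would be built roughly like this (a sketch of the URL construction only, not the actual client; the real one also threads the rate limiter and circuit breaker through an HTTP session):

```python
from urllib.parse import urlencode


def recent_changes_url(base_url: str = "https://en.wikipedia.org", limit: int = 50) -> str:
    """Build a MediaWiki API URL for the recent-changes list.

    Uses the documented action=query / list=recentchanges endpoint;
    the `limit` default is an assumption.
    """
    params = {
        "action": "query",
        "list": "recentchanges",
        "rclimit": limit,
        "format": "json",
    }
    return f"{base_url}/w/api.php?{urlencode(params)}"
```

The JSON response nests results under `query.recentchanges`, which matches the `data.get("query", {}).get("recentchanges", [])` access in the handler above.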