Expand metrics: per-org/cache labels, histograms #212

@DerDennisOP

Tracks scope intentionally deferred from the MVP /metrics endpoint shipped in #35 / #211.

The MVP exposes flat counters and gauges with the minimum useful set of labels: build/evaluation counts by status, scheduler queue depth, connected workers, and cache totals. The work below requires either new internal data collection or non-trivial label-dimension design.

Scope

  • Per-org / per-cache / per-project label dimensions. Most of the above metrics would be more useful broken down by tenant. Requires:
    • A label allowlist policy so cardinality cannot explode (e.g. opt-in per org, or top-N orgs only).
    • DB query rewrites to GROUP BY org/cache/project where applicable.
  • Build-duration and evaluation-duration histograms. Use the existing `build.build_time_ms` for builds and `(updated_at - created_at)` for evals, recorded via `prometheus::Histogram` with sane bucket boundaries (e.g. exponential 1s → 1h).
  • HTTP request-duration histogram. Extend the existing `TraceLayer` to record per-route latency keyed by `MatchedPath`.
  • Worker-side metrics:
    • Peer-to-peer transfer bytes/requests
    • Concurrent build slot utilisation (`assigned_jobs.len()` / `max_concurrent_builds`)
    • Build queue wait times
  • Process / runtime metrics: RSS, fd count, tokio task count. RSS and fd count are most easily obtained via the `prometheus` crate's optional process collector (Linux only); tokio task count is not covered by it and needs a separate source.
  • Per-cache traffic rate (not just totals) and storage growth rate broken down by cache. Source data is already in `cache_metric` — this is presentation, but with cardinality concerns.
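
For the duration histograms above, a minimal std-only sketch of what "exponential 1s → 1h" bucket boundaries could look like. `duration_buckets` and `cumulative_counts` are hypothetical helpers, not existing code; the factor of 2 is an assumption. In the real implementation the boundaries would presumably be passed to `prometheus::HistogramOpts` (the crate also ships `prometheus::exponential_buckets(start, factor, count)`, which takes a count rather than an upper cap).

```rust
/// Hypothetical helper: exponential bucket upper bounds in seconds.
/// Starts at `start`, multiplies by `factor`, and ends with `max` as the
/// final explicit bound so the largest bucket lines up with the cap.
fn duration_buckets(start: f64, factor: f64, max: f64) -> Vec<f64> {
    assert!(start > 0.0 && factor > 1.0 && max >= start);
    let mut bounds = Vec::new();
    let mut bound = start;
    while bound < max {
        bounds.push(bound);
        bound *= factor;
    }
    bounds.push(max);
    bounds
}

/// Cumulative bucket counts in the style Prometheus histograms report:
/// counts[i] is the number of observations <= bounds[i].
/// Observations are in milliseconds (as in `build.build_time_ms`) and are
/// converted to seconds before bucketing.
fn cumulative_counts(bounds: &[f64], observations_ms: &[u64]) -> Vec<u64> {
    let mut counts = vec![0u64; bounds.len()];
    for &ms in observations_ms {
        let secs = ms as f64 / 1000.0;
        for (i, &upper) in bounds.iter().enumerate() {
            if secs <= upper {
                counts[i] += 1;
            }
        }
    }
    counts
}
```

With `duration_buckets(1.0, 2.0, 3600.0)` this gives 13 boundaries (1, 2, 4, ... 2048, 3600), which is probably fine-grained enough for builds without creating an excessive number of series per label combination.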

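The label allowlist policy from the first scope item could be sketched roughly as follows. `LabelAllowlist`, `top_n`, and the `"other"` fallback value are all assumptions for illustration; the idea is only that non-allowlisted orgs collapse into one series so cardinality stays bounded.

```rust
use std::collections::HashSet;

/// Hypothetical policy type: only allowlisted orgs get their own label
/// value; every other org is folded into a single "other" series.
struct LabelAllowlist {
    allowed: HashSet<String>,
}

impl LabelAllowlist {
    /// Opt-in variant: build the allowlist from an explicit set of orgs.
    fn new<I, S>(orgs: I) -> Self
    where
        I: IntoIterator<Item = S>,
        S: Into<String>,
    {
        Self {
            allowed: orgs.into_iter().map(Into::into).collect(),
        }
    }

    /// Top-N variant: keep the `n` orgs with the highest counts
    /// (e.g. build counts from a GROUP BY query).
    fn top_n(counts: &[(&str, u64)], n: usize) -> Self {
        let mut sorted = counts.to_vec();
        // Sort descending by count; the sort is stable, so ties keep input order.
        sorted.sort_by(|a, b| b.1.cmp(&a.1));
        Self::new(sorted.into_iter().take(n).map(|(org, _)| org))
    }

    /// Label value to attach to a metric for the given org.
    fn label_for<'a>(&self, org: &'a str) -> &'a str {
        if self.allowed.contains(org) {
            org
        } else {
            "other"
        }
    }
}
```

Either constructor bounds the number of distinct label values at the allowlist size plus one, regardless of how many orgs exist.
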
Out of scope for this issue

  • A bundled Grafana dashboard (separate issue if desired)

Driven by the conversation in #35.

Labels: enhancement (New feature or request), low (Low severity)